ABSTRACT


The implementation of a major business transformation program in an official 
statistical agency is often designed to achieve, among other things, improvements in 
data collection efficiency, data processing methodology and data quality. However, 
achieving such improvements can, in itself, result in transitional statistical impacts 
which could be misinterpreted as real world change if they are not measured and 
handled appropriately. 


This paper describes early work to explore a range of statistical methods for 
measuring the statistical impacts which could be encountered in a survey redesign, 
using the ABS Labour Force Survey (LFS) as a case study, including: 


1. designing experiments for field trials of different questionnaires and data 
collection strategies; 


2. designing and conducting parallel collection activities such that the outgoing 
and the incoming surveys are run in parallel for a period of time to measure 
the impact of any collection changes; and 


3. refining the precision of impact measurement while implementing a new 
survey design. 


The results presented are for illustrative purposes for further development rather than 
for an actual implementation to the Australian LFS. 


State space modelling techniques have been utilised as the main approach for efficient 
impact measurement. This approach enables us to incorporate sampling error 
structure and time series intervention. The approach can also be extended to take 
advantage of other related data sources to improve impact measurement efficiency 
and accuracy. While the LFS is used as a case study, the models and methods 
developed can be extended to other surveys. 


1 Methodology Division, Australian Bureau of Statistics.
2 Department of Statistical Methods, Statistics Netherlands and Department of Quantitative Economics,
Maastricht University School of Business and Economics.
1. INTRODUCTION


It is common practice for national statistical offices to employ a repeated sampling 
scheme. This enables estimation of changes for the total aggregate (or population) as 
well as cross-sectional estimates. The time series produced under a repeated survey
scheme create a basis for social, economic and environmental analysis and
policy making.


Any changes in survey methodology would potentially affect the continuity of the 
estimated time series. This creates difficulties for users in interpreting movements in 
data and making policy decisions, because it may not be clear if the unusual 
movements in the estimates represent real world changes or if they are measurement 
changes introduced by new or alternative methodological approaches. Therefore any
changes in survey methodology have to be well managed: the impact of
methodological change needs to be identified, measured and, if necessary, adjusted for, to
provide a coherent picture before and after the change and to mitigate the risk of
misinterpretation of the changes.


The Australian Bureau of Statistics (ABS) is embarking on a transformation program, 
which includes, amongst other changes, applying different collection modes for 
survey data and using different, but more efficient, sampling frames and estimation 
methods for official statistics. Whilst this transformation is expected to deliver positive 
changes to official statistics, there is a risk that such changes could have a statistical 
impact on some ABS time series. The challenge is to develop methodologies to
measure, and where needed adjust for, such statistical impacts. A general framework
for statistical impact measurement is described by Van den Brakel et al. (2017).


The first and the most straightforward approach to assess impact of survey changes is 
to conduct a parallel run, i.e. to conduct the survey under the old and new approach 
simultaneously (see, for example, Van den Brakel, 2008). The outgoing and the 
incoming surveys are run in parallel for a period of time in order to collect 
information about any impact of the change. 


Various intervention analyses of time series models are also widely utilised to measure 
the possible time series discontinuities with and without utilising the prior 
information from a parallel run. For example, Van den Brakel and Krieg (2015) 
describe how a multivariate structural time series model was used to measure 
statistical impact induced by the Dutch Labour Force survey redesign. 


The challenge for measuring statistical impacts is managing the trade-off among 
different priorities including: 
• statistical accuracy required (minimum detectable impact and both Type I and
II errors),
• operational feasibility and impacts, and
• cost,

with consideration to:

• target statistics,

• the assumption of the intra cluster correlation between the control³ and
treatment samples,

• desired dissection of changes to be measured (net or component impacts),

• appetite for accepting more volatile published estimates during the parallel
period if sample size is reduced, and

• appetite for accepting revisions after the new survey implementation due to
the uncertainty of the measured impact derived from a small treatment sample.


The ABS is establishing a three-phase statistical impact measurement (SIM) strategy: 


Phase 1: Experimental design and field tests for measuring the effectiveness and broad 
statistical impact of a change and for making decisions about the final design of the 
new survey process. 


Phase 2: A parallel run approach to collect data and measure statistical impacts with
the required accuracy.


Phase 3: Implementation of changes, adjustment for statistical impacts, monitoring of 
the change process, and revision if necessary. 


This paper primarily presents initial development work for Phases 2 and 3 using state 
space modelling techniques to handle some special characteristics of the ABS Labour 
Force Survey (LFS), such as: 

• changes that are rotation group wave sensitive, and

• the smoothing effect of composite estimates,

while also considering:

• the need to balance different priorities, and

• the utilisation of statistical impact information from previous phases for further
improvement.


The results presented are for illustrative purposes for further development rather than 
for an actual implementation to the Australian LFS. 


Section 2 provides a brief introduction of the characteristics of the current ABS LFS 
survey and possible future changes. It also includes general structural time series 
models and their state space presentation for measuring statistical impact. Section 3 
describes the methods and models that could be used for LFS parallel run design and 
discusses simulated results. Section 4 presents a method to improve State Space
Model (SSM) performance for short time series, and provides an alternative
embedded experimental design approach for further study. Section 5 evaluates some
options under consideration by the ABS, and suggests a hybrid option to balance
different priorities in terms of cost, accuracy and revisions. Section 6 discusses the
implications of different options and future work to support the three phase SIM
strategy.


3 Control samples are also referred to as regular samples, which produce the regular estimates for publication.


All the calculations reported in this paper were carried out with programs written in 
the SSM procedure in SAS, SsfPack (see Koopman et al., 2008) and R.


2. ABS LABOUR FORCE SURVEY 


2.1 ABS Labour Force Survey 


The Labour Force Survey (LFS) is based on a multi-stage area sample of dwellings and 
covers approximately 0.32% of the civilian population of Australia aged 15 years and 
over (ABS, 2016). Households selected for the LFS are interviewed using face-to-face, 
phone or web form each month for eight consecutive months, with one-eighth of the 
sample being replaced each month. The LFS sample can be thought of as comprising 
eight sub-samples (or rotation groups, RG hereafter), with each sub-sample
remaining in the survey for eight months, and one rotation group "rotating out" each 
month and being replaced by a new group "rotating in". This high overlap of 
respondents from month-to-month induces a strong serial correlation into the 
sampling errors. In addition, the replacement sample is generally selected from the 
same geographic areas as the outgoing one, as part of a representative sampling 
approach. This induces additional serial correlation in the sampling errors even for 
non-overlapping outgoing and incoming sub-samples. 
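
The month-to-month overlap implied by this rotation scheme can be illustrated with a short calculation. The sketch below assumes an idealised panel of eight equal-sized rotation groups with exactly one group replaced each month; the function name `overlap_fraction` is illustrative and not part of any ABS system.

```python
def overlap_fraction(lag_months: int, n_groups: int = 8) -> float:
    """Share of rotation groups common to two survey months `lag_months` apart.

    Assumes one of `n_groups` equal-sized groups is replaced every month, so a
    group observed in month t is still in the sample in month t + lag only
    while lag < n_groups.
    """
    if lag_months < 0:
        raise ValueError("lag_months must be non-negative")
    return max(n_groups - lag_months, 0) / n_groups


# Overlap for consecutive months and for months one to eight apart.
overlaps = {lag: overlap_fraction(lag) for lag in range(0, 9)}
```

Under these assumptions seven of the eight rotation groups are common to consecutive months, which is the source of the strong serial correlation in the sampling errors described above.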


The estimation method used in the LFS is composite estimation. By exploiting the 
high correlation between overlapping samples across months, the ABS LFS composite 
estimator combines the previous six and current months’ data by applying different 
factors (also called BLUE multipliers) according to length of time in the survey (or
wave, e.g. wave 1 is the first time in the survey). After these factors are applied, the seven
months of data are weighted to align with current month population benchmarks.


2.2 Reasons for measuring statistical impact using GREG estimates at 
rotation group level 


Although our interest is to measure a statistical impact at the level of composite 
estimates, there are different ways to make the measurement at different levels. In order
to achieve accurate and timely measurement, we propose to measure the
statistical impact at rotation group level because of the properties of the composite
estimator.


The multipliers of the ABS LFS composite estimator are applied to past observations 
of GREG estimates at rotation group level to produce current LFS estimates. As a 
result, a smoothing effect will apply to any abrupt statistical impact at the current end 
of the series. To avoid such an effect, SIM modelling work needs to be conducted on 
the GREG estimates, i.e. prior to applying the composite multipliers, so that detection 
and measurement of abrupt impacts is timely and accurate. The corresponding 
impacts to the composite LFS estimates can then be derived accordingly. 


There are a number of potential changes to the LFS which must be considered and 
their statistical impacts assessed. It is not realistic to assume that a statistical impact is
uniform across all waves because the proposed changes may have different
impacts on different waves. We refer to such changes as “wave sensitive” in this paper.


The following are examples of possible future changes which could be wave sensitive. 


• Use of e-collection as the primary collection mode. Changes to respondent
induction and the strategy for promoting web-form take-up potentially lead to a
wave sensitive effect.


• Changing the placement and timing of supplementary surveys in the monthly
population survey of which the LFS is a part. This may be wave sensitive because
supplementary survey placement is currently wave dependent.


• Implementing responsive design follow-up, to ensure data acquisition effort is
focused on quality samples and is cost-effective. This can potentially have more
impact on wave one than on other waves.


• Further wave dependent effects are possibly driven by any wave dependence in
response profiles.


2.3 Structural time series model at the rotation group level for measuring 
statistical impact 


Assume $\hat{y}_{i,t}$ is a GREG estimate of one of the main LFS variables, such as the number of
employed or the number of unemployed persons, from the rotation group that in the
current month $t$ has been observed $i$ times ($i = 1, \ldots, 8$) (referred to as wave $i$
hereafter). Without loss of generality, the structural measurement errors for wave
$i$ at time $t$ are


1. rotation group bias (RGB) $b_i$, and


2. sampling error $e_{i,t}$.

The rotation group bias $b_1, \ldots, b_8$ reflects a permanent wave sensitive level shift
compared to one reference wave (in this study the reference wave was wave 7, therefore
$b_7 = 0$).

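The following equation⁴ describes the relationship between an observed estimate $\hat{y}_{i,t}$
and the unobserved components $y_t$, $b_i$, $e_{i,t}$ and the intervention effect (permanent level
shift) $\alpha_i$ with a time invariant assumption⁵:

$$\begin{pmatrix} \hat{y}_{1,t} \\ \vdots \\ \hat{y}_{8,t} \end{pmatrix}
= \mathbf{1}_{[8]}\, y_t
+ \begin{pmatrix} b_1 \\ \vdots \\ b_8 \end{pmatrix}
+ \begin{pmatrix} \alpha_1 x_{1,t} \\ \vdots \\ \alpha_8 x_{8,t} \end{pmatrix}
+ \begin{pmatrix} e_{1,t} \\ \vdots \\ e_{8,t} \end{pmatrix} \tag{2.1}$$

where $y_t$ is the true population value, $\mathbf{1}_{[8]}$ is the 8-dimensional vector of ones and $x_{i,t}$ is
an intervention dummy variable defined as

$$x_{i,t} = \begin{cases} 1, & \text{if observations are obtained under the new design of wave } i \text{ at time } t, \\ 0, & \text{otherwise.} \end{cases}$$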


$y_t$ can be expressed by a structural time series model (STM)

$$y_t = T_t + S_t + I_t, \tag{2.2}$$

where $T_t$, $S_t$ and $I_t$ denote the smooth trend model, the seasonal model and the irregular
component (often assumed to be white noise for unexplained variation of the population
parameter) respectively; see Durbin and Koopman (2012) for details. The sampling
error stochastic process $e_{i,t}$ can be modelled as white noise for wave 1 (assuming no
correlation with estimates from the previous panel)


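$$e_{1,t} = u_{1,t}, \qquad u_{1,t} \sim NID(0, \sigma^2_{u_1}), \tag{2.3}$$

as an AR(1) process for wave 2

$$e_{2,t} = \phi_1 e_{1,t-1} + u_{2,t}, \qquad u_{2,t} \sim NID(0, \sigma^2_{u_2}), \tag{2.4}$$

and as an AR(2) process for the other waves ($i = 3, 4, \ldots, 8$)

$$e_{i,t} = \phi_1 e_{i-1,t-1} + \phi_2 e_{i-2,t-2} + u_{i,t}, \qquad u_{i,t} \sim NID(0, \sigma^2_{u_i}), \tag{2.5}$$

where the coefficients $\phi_1$ and $\phi_2$ and the sampling error disturbance variance $\sigma^2_{u_i}$ can be
predefined from the LFS data (see Pfeffermann et al., 1998).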
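
The wave-dependent sampling error models (2.3) to (2.5) can be sketched as a small simulation. The sketch assumes that the autoregression follows a panel through its successive waves (wave $i$ in month $t$ is the panel that was wave $i-1$ in month $t-1$), that all waves share one disturbance standard deviation, and that lagged terms are omitted in the first months where no history exists. The function name and the parameter values used with it are hypothetical and are not LFS estimates.

```python
import random


def simulate_sampling_errors(n_months, phi1, phi2, sigma_u, n_waves=8, seed=0):
    """Simulate e[t][i-1], the sampling error of wave i in month t.

    Wave 1 is white noise, wave 2 is AR(1) and waves 3+ are AR(2), with the
    lags taken along the panel: (i-1, t-1) and (i-2, t-2).
    """
    rng = random.Random(seed)
    e = [[0.0] * n_waves for _ in range(n_months)]
    for t in range(n_months):
        for i in range(1, n_waves + 1):
            value = rng.gauss(0.0, sigma_u)          # disturbance u_{i,t}
            if i >= 2 and t >= 1:
                value += phi1 * e[t - 1][i - 2]      # phi1 * e_{i-1,t-1}
            if i >= 3 and t >= 2:
                value += phi2 * e[t - 2][i - 3]      # phi2 * e_{i-2,t-2}
            e[t][i - 1] = value
    return e


# Illustrative call with made-up coefficients.
errors = simulate_sampling_errors(n_months=24, phi1=0.5, phi2=0.2, sigma_u=1.0)
```

Each row of `errors` holds one month and each column one wave, so a single panel can be traced along the diagonal from wave 1 to wave 8.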


4 All the modelling work in this paper is on the logarithmic scale. The Standard Error (SE) is equivalent to the Relative
Standard Error (RSE) in the original scale. They may be used interchangeably in this paper.

5 Further elaboration of this simple model may be needed if evidence emerges that this assumption needs
revision. See the discussion of future work in Section 6.
3. PARALLEL RUN DESIGN


Design Considerations in the LFS Context 


The objectives of any LFS parallel run design would be to: 

• measure the direct statistical impacts induced by the ABS process change on
the published ABS LFS outputs rather than unit level impacts,

• identify statistical impacts in a timely manner to support statistical risk
management, and

• obtain accurate statistical impact measurement with a minimum treatment
sample for the agreed accuracy level, and a feasible parallel run design.


Working assumptions 


For this study, the hypothetical accuracy criterion⁶ is set to detect a significant
statistical impact as follows:


One standard error of the population⁷ estimates (43750 and 19500 for employed and
unemployed respectively), with the conventional Type I and II errors less than 5%
and 50% respectively.


The minimum detectable impact (MDI) is defined as the size of the impact that can be
detected based on the above stated accuracy criterion. Its value is calculated as the
standard error of the estimated statistical impact times a multiplier which is derived
from a set of predefined Type I and II errors. The multiplier is 1.96 for Type I and II
errors of 5% and 50% respectively (see Section 2 of Appendix 1). A statistical impact
measurement method is regarded as successful when the ratio of the MDI to
one standard error of the population estimate (referred to as the MDI ratio hereafter) is
less than or equal to one. The MDI ratio provides a uniform measure and makes the
comparison of SIM for different variables easier.
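
The MDI calculation can be sketched in a few lines. The sketch assumes the multiplier is the usual two-sided normal quantile sum $z_{1-\alpha/2} + z_{1-\beta}$, which reproduces the 1.96 quoted above for Type I and II errors of 5% and 50%; the exact form used in Appendix 1 is not shown here, and the function names are illustrative.

```python
from statistics import NormalDist


def mdi_multiplier(type1: float = 0.05, type2: float = 0.50) -> float:
    """Multiplier z_{1-alpha/2} + z_{1-beta} (two-sided test assumed)."""
    z = NormalDist().inv_cdf
    return z(1.0 - type1 / 2.0) + z(1.0 - type2)


def mdi_ratio(se_impact: float, se_population: float,
              type1: float = 0.05, type2: float = 0.50) -> float:
    """MDI divided by one standard error of the population estimate.

    Both standard errors must be on the same scale. A value of at most one
    meets the hypothetical accuracy criterion described in the text.
    """
    return mdi_multiplier(type1, type2) * se_impact / se_population
```

For instance, an impact estimated with half the standard error of the population estimate gives `mdi_ratio(0.5, 1.0)` of about 0.98, which just meets the criterion.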


Scope of the impacts to be measured 


The statistical impact measurement described here is primarily designed for 
identifying a permanent level shift induced by a new LFS design with additional 
consideration of sampling error properties.


The following issues were broadly considered, but are out of scope for the current
work:


1. Some aspects of a new LFS design may have short term transitory impacts, e.g.
due to unfamiliarity with new operational processes. These types of
transitory impact should be mitigated by early training and better preparation,
rather than measured as a part of a parallel run.

2. Some changes may have permanent impacts on seasonal patterns. The
affordable size and duration of any parallel run is unlikely to be enough to
precisely estimate any such change. The nature of such changes would need to
be assessed broadly and qualitatively, either prior to the parallel run via
experiments and field testing or through ongoing monitoring over a longer
period. An STM, for example, that allows for a break in the seasonal pattern
after the changeover can be used to assess the impact on seasonal patterns several
years after the changeover.


6 There is no official accuracy criterion at the time of writing. The hypothetical accuracy criterion is purely to assist
the discussion in this paper.

7 Population refers to employed or unemployed persons in the context of this paper.


Parameters in parallel run design 


The following two parameters for a parallel run design are required to meet the 
accuracy criterion and operational feasibility: 

(1) the size of the treatment sample; and

(2) the duration of the parallel run.

From an operational feasibility perspective, the duration of any parallel run in this
study is limited to less than two years.


3.1 State Space Model formulation 


The set of equations (2.1) to (2.5) have described a general SSM framework for GREG 
estimates of LFS at the rotation group level with interventions. This model is a 
common approach in the literature to measure a statistical impact as the intervention 
component. 


However, such a conventional model has to estimate many hyper-parameters because
it needs to estimate the “true” population value $y_t$ in the structural time series model
equation (2.2). Basically, the differences between the model predicted value and the
observed value provide the source for measuring the statistical impact. Such relatively
complicated model identification and prediction can be vulnerable to a rapid real
world change that the model may not be able to account for during the parallel
run period. If our sole goal is to estimate the statistical impact rather than produce a
“true” population estimate, the model can be simplified for a parallel run scenario.


In the case of the LFS, the existing composite estimator will continue to be used to
produce the “true” labour force population estimates. As such, the conventional
intervention analysis can be simplified by modelling the differences between the
estimates produced under the current design and the estimates produced under a
new design conducted in parallel. This will reduce the risk of rapid changes or
outliers affecting our estimation, and will improve robustness by reducing the model
complexity. We develop the difference SSM for estimating statistical impact below.


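Conceptual decomposition of a statistical impact on the GREG estimate

Suppose a new LFS design $(n)$ starts from time $t_0$. Then for $t \ge t_0$ the new GREG
estimate for wave $i$ is $\hat{y}^{(n)}_{i,t}$:

$$\begin{pmatrix} \hat{y}^{(n)}_{1,t} \\ \vdots \\ \hat{y}^{(n)}_{8,t} \end{pmatrix}
= \mathbf{1}_{[8]}\, y_t
+ \begin{pmatrix} b^{(n)}_1 \\ \vdots \\ b^{(n)}_8 \end{pmatrix}
+ \begin{pmatrix} e^{(n)}_{1,t} \\ \vdots \\ e^{(n)}_{8,t} \end{pmatrix} \tag{3.1}$$

where $b^{(n)}_i$ and $e^{(n)}_{i,t}$ are the RGB and sampling error of the new LFS design.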


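The statistical impact for each wave is

$$\underbrace{\begin{pmatrix} \hat{y}^{(n)}_{1,t} - \hat{y}_{1,t} \\ \vdots \\ \hat{y}^{(n)}_{8,t} - \hat{y}_{8,t} \end{pmatrix}}_{\text{difference in estimates}}
= \underbrace{\begin{pmatrix} b^{(n)}_1 - b_1 \\ \vdots \\ b^{(n)}_8 - b_8 \end{pmatrix}}_{\text{difference in RGB}}
+ \underbrace{\begin{pmatrix} e^{(n)}_{1,t} - e_{1,t} \\ \vdots \\ e^{(n)}_{8,t} - e_{8,t} \end{pmatrix}}_{\text{difference in SE}} \tag{3.2}$$

The structural changes come from

1. a permanent level shift (LS) presented in the “difference in RGB”, and

2. a dynamic sampling error change presented in the “difference in SE”.

The “true” population value $y_t$ cancels out under the difference model formulation and is,
therefore, excluded from estimation.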


Estimating the statistical impact during a parallel run


In practice, a new design will usually be introduced by each successive rotation group.
Assuming a parallel run is conducted for $t_0 \le t < t_1$, a new series $\tilde{y}_{i,t}$ can be
constructed⁸ as

$$\tilde{y}_{i,t} = \begin{cases} \hat{y}^{(n)}_{i,t}, & t_0 \le t < t_1 \text{ and wave } i \text{ has a treatment sample}, \\ \hat{y}_{i,t}, & \text{otherwise}, \end{cases} \tag{3.3}$$

$$\tilde{y}_{i,t} = y_t + b_i + x_{i,t}\,(b^{(n)}_i - b_i) + e_{i,t} + x_{i,t}\,(e^{(n)}_{i,t} - e_{i,t}), \tag{3.4}$$

with an intervention dummy variable $x_{i,t}$,

$$x_{i,t} = \begin{cases} 1, & t_0 \le t < t_1 \text{ and wave } i \text{ has a treatment sample}, \\ 0, & \text{otherwise}. \end{cases} \tag{3.5}$$


8 Note: the sampling error for the treatment sample may be larger due to the smaller size of the treatment
sample.
Therefore, the permanent level shift (LS) $\alpha_i = (b^{(n)}_i - b_i)$ (or $b^{(n)}_i = b_i + \alpha_i$ in model (2.1))
for wave $i$ can be estimated from the parallel run with a combined sampling error
process $\eta_{i,t} = e^{(n)}_{i,t} - e_{i,t}$:

$$\tilde{y}_{i,t} - \hat{y}_{i,t} = \alpha_i x_{i,t} + \eta_{i,t} x_{i,t}, \qquad t_0 \le t < t_1. \tag{3.6}$$

Assuming the sample rotation design continues, both $e_{i,t}$ and $e^{(n)}_{i,t}$ follow the same AR
process (see eqns (2.3) to (2.5)), so $\eta_{i,t}$ follows that process too but with a different
disturbance variance $\sigma^2_{\varepsilon_i}$, i.e.

$$\eta_{i,t} = \phi_1 \eta_{i-1,t-1} + \phi_2 \eta_{i-2,t-2} + \varepsilon_{i,t}, \qquad \varepsilon_{i,t} \sim NID(0, \sigma^2_{\varepsilon_i}), \tag{3.7}$$

$$\sigma^2_{\varepsilon_i} = \sigma^2_{u_i} + \sigma^2_{u^{(n)}_i} - 2\,\mathrm{corr}(e^{(n)}, e)\,\sigma_{u_i}\,\sigma_{u^{(n)}_i}. \tag{3.8}$$

$\phi_1$, $\phi_2$ and $\sigma_{u_i}$ can be estimated from the existing LFS sample design. $\sigma_{u^{(n)}_i}$ can
be determined by the new treatment sample design. A more accurate estimate of $\alpha_i$
from equations (3.6) and (3.7) can be achieved by maximising the correlation $\rho =
\mathrm{corr}(e^{(n)}, e)$ in equation (3.8). This relies on the working assumption made earlier
that the existing and new LFS designs have the same sampling error stochastic
process, i.e. the same autoregressive coefficients of the AR(2) model. Therefore,
$\sigma^2_{\varepsilon_i} \approx (\sigma_{u_i} - \sigma_{u^{(n)}_i})^2$ when $\rho \approx 1$.
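
The effect of the correlation on the disturbance variance of the difference series in equation (3.8) can be checked numerically. The sketch below is a direct transcription of the variance of a difference of two correlated variables; the function name and the standard deviations used with it are illustrative, not LFS values.

```python
import math


def difference_disturbance_variance(sd_control: float, sd_treatment: float,
                                    rho: float) -> float:
    """Variance of the difference of two disturbances with correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return (sd_control ** 2 + sd_treatment ** 2
            - 2.0 * rho * sd_control * sd_treatment)


# With hypothetical standard deviations 1.0 (control) and 1.4 (treatment),
# the variance falls steadily as the correlation rises.
variances = [difference_disturbance_variance(1.0, 1.4, r)
             for r in (0.0, 0.3, 0.5, 0.8, 1.0)]
```

At a correlation of one the variance collapses to the squared difference of the two standard deviations, which is why a design that raises the correlation between control and treatment samples sharpens the impact estimate.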


The difference between the rotation group bias of the existing design and new design 
can be estimated from the following state space model presentation. 


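Observation equation:

$$\hat{y}^{(n)}_{i,t} - \hat{y}_{i,t} = \alpha_i x_{i,t} + \eta_{i,t} x_{i,t}, \qquad i = 1, \ldots, 8. \tag{3.9}$$

State equation⁹:

$$\begin{pmatrix} \eta_t \\ \eta_{t-1} \end{pmatrix}
= \begin{pmatrix} \Phi_1 & \Phi_2 \\ I & O \end{pmatrix}
\begin{pmatrix} \eta_{t-1} \\ \eta_{t-2} \end{pmatrix}
+ \begin{pmatrix} \varepsilon_t \\ 0 \end{pmatrix}, \tag{3.10}$$

where $\eta_t = (\eta_{1,t}, \ldots, \eta_{8,t})'$, $\varepsilon_t = (\varepsilon_{1,t}, \ldots, \varepsilon_{8,t})'$,

$$\Phi_1 = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
\phi_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & \phi_1 & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & & & \vdots \\
0 & 0 & 0 & \cdots & \phi_1 & 0
\end{pmatrix}, \qquad
\Phi_2 = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 \\
\phi_2 & 0 & \cdots & 0 & 0 & 0 \\
\vdots & \ddots & & & & \vdots \\
0 & 0 & \cdots & \phi_2 & 0 & 0
\end{pmatrix}.$$


9 Without loss of generality, the state equation is written as an AR(2) process although waves 1 and 2 follow
a white noise and an AR(1) process respectively.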
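
The transition matrices of the state equation can be assembled programmatically. The sketch below assumes the structure described in this section: wave $i$ at time $t$ loads on wave $i-1$ at $t-1$ through $\phi_1$ and on wave $i-2$ at $t-2$ through $\phi_2$, so wave 1 has no lagged terms and wave 2 only a first-order term. It also assumes wave 2 shares the coefficient $\phi_1$ with the later waves. Plain nested lists are used to avoid dependencies, and the function names are illustrative.

```python
def build_phi_matrices(phi1: float, phi2: float, n_waves: int = 8):
    """Return (Phi_1, Phi_2) as n_waves x n_waves nested lists."""
    phi_1 = [[0.0] * n_waves for _ in range(n_waves)]
    phi_2 = [[0.0] * n_waves for _ in range(n_waves)]
    for row in range(1, n_waves):
        phi_1[row][row - 1] = phi1      # wave i depends on wave i-1 at t-1
    for row in range(2, n_waves):
        phi_2[row][row - 2] = phi2      # wave i depends on wave i-2 at t-2
    return phi_1, phi_2


def companion_matrix(phi_1, phi_2):
    """Stack [[Phi_1, Phi_2], [I, O]] for the state (eta_t, eta_{t-1})."""
    n = len(phi_1)
    top = [phi_1[r] + phi_2[r] for r in range(n)]
    bottom = [[1.0 if c == r else 0.0 for c in range(2 * n)] for r in range(n)]
    return top + bottom


phi_1, phi_2 = build_phi_matrices(phi1=0.5, phi2=0.2)   # made-up coefficients
transition = companion_matrix(phi_1, phi_2)             # 16 x 16 matrix
```

The first row of `phi_1` and the first two rows of `phi_2` stay at zero, which is how the white noise and AR(1) behaviour of waves 1 and 2 is accommodated inside a single AR(2) form.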


Analytical solution for the parallel run parameters

The estimated coefficients $\alpha_i$ ($i = 1, \ldots, 8$) are the permanent level shifts and are
related to the new LFS design RGB by $b^{(n)}_i = b_i + \alpha_i$ ($i = 1, \ldots, 8$), where $b_i$ is
estimated from equations (3.9) and (3.10) for the existing LFS design.

The null hypothesis of no statistical impact is $H_0: \alpha_i = 0$ ($i = 1, \ldots, 8$). Based on
classical statistical theory, we wish to determine the sample size needed to test
whether the mean of the treatment samples is different from that of the control samples,
where the control is regarded as the true value and the difference is the statistical
impact.

Temporarily assuming the term $\eta_{i,t} x_{i,t}$ in equation (3.9) is an NID sampling error with
standard deviation $\sigma_{\eta_i} = \sigma$, and without considering the “systematic” sampling error
process, we can derive the treatment sample sizes from equations (A1.1) and (A1.2) in
Appendix 1 with $\mu_0 = 0$ for a single month.


For the ABS LFS, the population standard deviation $\sigma$ can be roughly estimated from
the sample standard error $\sigma_e$, i.e. $\sigma = \sigma_e \sqrt{n_C}$, where $n_C$ is the sample size of the
control sample of 30000 households per month on average. However, $e_{i,t}$ is actually
driven by the sampling error disturbance $u_{i,t}$ through an AR process (see eqns (2.3) to
(2.5)). In other words, the sampling error process is partly “predictable” from the AR
models driven by the sampling error disturbances.


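The variance of the sampling error disturbance $\sigma^2_{\varepsilon_i}$ in equations (3.7)
and (3.8) can be rewritten as

$$\sigma^2_{\varepsilon_i} = \gamma_i \left( \sigma^2_{e_i} + \sigma^2_{e^{(n)}_i} - 2\rho\,\sigma_{e_i}\,\sigma_{e^{(n)}_i} \right), \tag{3.11}$$

where $\sigma^2_{e_i}$ and $\sigma^2_{e^{(n)}_i}$ are the variances of the control and treatment sampling errors of
wave $i$ respectively,

$$\gamma_i = \begin{cases}
1, & i = 1 \quad (\phi_1 = 0,\ \phi_2 = 0), \\
1 - \phi_1^2, & i = 2 \quad (\phi_1 \ne 0,\ \phi_2 = 0), \\
(1 + \phi_2)\left[ (1 - \phi_2)^2 - \phi_1^2 \right] / (1 - \phi_2), & i \ge 3 \quad (\phi_1 \ne 0,\ \phi_2 \ne 0),
\end{cases}$$

is derived from the sampling error AR process, and
$\kappa = n_T / n_C$ is the ratio between the treatment and control sample sizes.
Considering that the problem at hand is a one-sample problem for the treatment samples, the variance of
the treatment samples can be derived as
$(n_C / n_T)\,\sigma^2_e = \sigma^2_e / \kappa$.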
Replacing $\sigma / \sqrt{n}$ in equations (A1.1) and (A1.2) by the standard deviation of the sampling
error of the treatment samples, $\sigma_e / \sqrt{\kappa}$, we produce equation (3.12) to determine the
treatment sample size $n_T$:

$$n_T = n_C \left( \frac{\sigma_e\, Z}{\mathrm{MDI}} \right)^2, \qquad \text{where } Z = z_{1-\alpha/2} + z_{1-\beta}. \tag{3.12}$$


After considering the AR(2) sampling error process and the intra cluster correlation from
(3.11), we derive the standard error of $\alpha_i$ at any point of time $t$:

$$SE(\alpha_i \mid \kappa) = SE(\varepsilon_{i,t}) = \sigma_{\varepsilon_i} = \sqrt{\gamma_i \left( \frac{1}{\kappa} + 1 - \frac{2\rho}{\sqrt{\kappa}} \right)}\; \sigma_{e_i}. \tag{3.13}$$

We can also derive the gain

$$\frac{SE(\alpha_i \mid \kappa)}{\sigma_{e_i} / \sqrt{\kappa}} = \sqrt{\gamma_i \left( 1 + \kappa - 2\rho\sqrt{\kappa} \right)} \tag{3.14}$$

in terms of the proportional reduction in the standard error of $\alpha_i$ obtained by considering
the sampling error process and the intra cluster correlation using the SSM. The
smaller the gain value, the bigger the reduction of the standard error. This gain
decreases with increasing intra cluster correlation. When $\rho = 0$ (there is no intra
cluster correlation), the gain is $\sqrt{\gamma_i (1 + \kappa)}$. For example, when $\kappa = 0.5$, the gains
are 0.64 and 0.96 for employed and unemployed of the LFS respectively.
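
The quantities in this derivation are simple to evaluate. The sketch below codes the AR-derived factor $\gamma_i$, the single-month standard error of equation (3.13) and the gain of equation (3.14) under the reading of those equations given in this section. The function names are illustrative, and the value 0.273 used in the example is a hypothetical $\gamma$ chosen only because it gives a gain close to the 0.64 quoted for employed; it is not taken from the LFS.

```python
import math


def gamma_factor(wave: int, phi1: float, phi2: float) -> float:
    """Ratio of disturbance variance to sampling error variance for a wave."""
    if wave == 1:
        return 1.0                                   # white noise
    if wave == 2:
        return 1.0 - phi1 ** 2                       # AR(1)
    return (1.0 + phi2) * ((1.0 - phi2) ** 2 - phi1 ** 2) / (1.0 - phi2)  # AR(2)


def se_alpha_single_month(gamma: float, kappa: float, rho: float,
                          sd_e: float) -> float:
    """Standard error of the level shift for one month, as in (3.13)."""
    return math.sqrt(gamma * (1.0 / kappa + 1.0 - 2.0 * rho / math.sqrt(kappa))) * sd_e


def gain(gamma: float, kappa: float, rho: float) -> float:
    """Ratio of (3.13) to sd_e / sqrt(kappa), as in (3.14)."""
    return math.sqrt(gamma * (1.0 + kappa - 2.0 * rho * math.sqrt(kappa)))


example_gain = gain(gamma=0.273, kappa=0.5, rho=0.0)   # about 0.64
```

Because `gain` is defined as a ratio, it does not depend on the sampling error level `sd_e`, only on the AR structure, the relative treatment sample size and the correlation.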
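For a parallel run with a treatment sample over periods $\{T\}$, the standard error of $\alpha_i$
is

$$SE(\alpha_i \mid \kappa, n) = \sqrt{\frac{1}{n}\,\sigma^2_{\varepsilon_i}} = \sqrt{\frac{\gamma_i}{n} \left( \frac{1}{\kappa} + 1 - \frac{2\rho}{\sqrt{\kappa}} \right)}\; \sigma_{e_i}, \qquad \text{because } \varepsilon_{i,t} \sim NID(0, \sigma^2_{\varepsilon_i}), \tag{3.15}$$

where $n$ is the number of times wave $i$ is observed over the periods $\{T\}$.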
; 


3.2 Simulation Study


Equation (3.13) provides a theoretical solution to determine the standard error of the
statistical impact $\{\alpha_i\}$. It can be used to allocate the treatment sample size by
optimising $n$ (the number of times each wave is included in the periods of a parallel
run) and $\kappa$ (treatment sample size as a proportion of control sample size) to meet the
statistical accuracy criteria with the predefined parameters $\gamma_i$, $\rho$ and $\sigma_{e_i}$ (which are
specific for employed and unemployed estimates). For this simulation study we
assume an equal sampling error for the 8 waves¹⁰, i.e. $\sigma^2_{e_i} = \sigma^2_e$ ($i = 1, \ldots, 8$). The
sampling error disturbance variance of wave $i$, $\sigma^2_{u_i}$, can then be derived from $\gamma_i$ and $\sigma^2_e$.

A simulation study was undertaken with the following two objectives: 


• to verify whether the theoretical solution is correct, and
• to evaluate Kalman filter performance of the SSM on relatively short time series
derived from a parallel run.


Due to operational constraints, only one treatment RG can be introduced per month.
Figure 3.1 presents graphically a 15 month parallel run scheme, with each treatment
rotation group running in parallel for a full 8 (=15-7) months. The shaded cells
represent treatment rotation groups.
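
The staggered introduction described above can be written out as a schedule. The sketch below assumes that the eight treatment rotation groups are introduced in consecutive months starting at month 1, and that each runs for the parallel run duration minus seven months, as in the 15 month example. The month numbering and the function name are assumptions made for illustration and may not match the numbering used in Figure 3.1.

```python
def parallel_run_schedule(duration: int, n_groups: int = 8) -> dict:
    """Map each treatment rotation group to the months in which it is run.

    Group g is introduced in month g and stays for
    duration - (n_groups - 1) months, so the last group finishes in the
    final month of the parallel run.
    """
    months_per_group = duration - (n_groups - 1)
    if months_per_group < 1:
        raise ValueError("duration is too short to introduce every group")
    return {g: list(range(g, g + months_per_group))
            for g in range(1, n_groups + 1)}


schedule_15 = parallel_run_schedule(15)
# Group 1 runs in months 1 to 8 and group 8 in months 8 to 15.
```

The same function gives the schedules for the other durations considered in the simulation, for example four months per group for an 11 month parallel run.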


10 An alternative assumption is equal sampling error disturbance variances. We found that the two
assumptions make no difference to the standard error of $\alpha_i$ because they have the same internal
sampling error structure driven by the AR(2) process.
Figure 3.1 Scheme for a 15 month parallel run

[Figure: rows are Month 0 to Month 16; columns are the 1st to 8th treatment rotation groups;
shaded cells mark the months in which each treatment rotation group runs in parallel.]


100 replicates were generated for different combinations of 


• parallel durations (11, 13, 15, and 19 months),


• treatment sample sizes ($\kappa$ = 30%, 50%, 80% and 100%), and
• intra cluster correlations ($\rho$ = 0, 0.3, 0.5, 0.8).


Appendix 3 gives more details on how the data were generated for this simulation 
study. 


Zero means and the known sampling error variances for both control and treatment
samples are used to set up the initial conditions of the sampling error disturbances for
the Kalman filter. See more details in Section 4.


The standard errors of $\{\alpha_i\}$ were estimated and found to be consistent over different waves
($i = 1, 2, \ldots, 8$) regardless of the true value of $\{\alpha_i\}$. The top panel of Figure 3.2 shows
the average standard error of $\alpha_i$ (avg.se¹¹) against different combinations of intra
cluster correlation (Rho), parallel run duration (Period) and treatment sample size
(Kappa). The results are consistent with our expectation for both employed and
unemployed. That is, the larger the intra cluster correlation, the longer the parallel run
duration or the larger the treatment sample size, the smaller the standard error of $\alpha_i$.


11 The estimated standard error of $\alpha_i$ for each replicate appears consistent regardless of the wave and the
size of $\alpha_i$. avg.se is calculated as the average over all replicates and the 8 waves, i.e.

$$\text{avg.se} = \frac{1}{8 \times 100} \sum_{j=1}^{100} \sum_{i=1}^{8} \hat{\sigma}_{i,j},$$

where $\hat{\sigma}_{i,j}$ is the estimated standard error of $\alpha_i$ (wave $i$) for replicate $j$.
We also created a variable

$$X = \sqrt{\frac{1}{n} \left( \frac{1}{\kappa} + 1 - \frac{2\rho}{\sqrt{\kappa}} \right)}$$

to examine its relation with the
standard error of $\alpha_i$. The lower panel of Figure 3.2 shows the simulated results (dots)
against the result (regression line) derived from regressing avg.se on X. This
graphical presentation clearly demonstrates that there is a very strong linear
relationship between avg.se and X.


Figure 3.2 Average SE against treatment sample size, intra cluster correlation, and parallel duration

[Figure: top panels show the average SE of the estimated SIM of the simulated employed and
unemployed data against Period, by Kappa; lower panels show the average SE against X for the
simulated employed and unemployed data.]

Table 3.3 shows the regression analysis result. The high R² values confirm that avg.se
can be predicted from X.


From this analysis, we can confidently conclude that the theoretical articulation of
equation (3.13) is correct. It appears that the Kalman filter performed well in estimating
$\alpha_i$ with the expected standard error. More detailed analysis of Kalman filter performance will
be discussed in the next section.
Table 3.3 Regression of average SE on X


                     Employed                 Unemployed
Coefficient of X     1.035e-02 (5.482e-05)    0.0706435 (0.0002898)
Null deviance        1.6335e-03               7.6127e-02
Residual deviance    2.1526e-06               6.0167e-05
R²                   0.9987                   0.9992


In the context of the parallel run design, from the structure of X, we confirm that


• Intra cluster correlation is the most powerful variable for reducing the standard
error of the estimated $\alpha_i$.

• The treatment sample size is the second most important variable. When the
treatment sample size is the same as the control sample size (i.e. $\kappa = 1$), it is
the most efficient balanced design to minimise the standard error of the estimated
$\alpha_i$.

• The duration of the parallel run is the least powerful factor among the three for
reducing the standard error of the estimated $\alpha_i$.


The coefficients of X in Table 3.3 can be used with any combination of parallel run
duration, treatment sample size and intra cluster correlation to predict the standard
error of $\alpha_i$ for both employed and unemployed of the LFS. Therefore, an optimised
parallel run design can be achieved.
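
The prediction step can be sketched using the coefficients reported in Table 3.3. The sketch assumes the fitted relationship has no intercept, since the table reports only a coefficient of X, and it takes X to be $\sqrt{(1/n)(1/\kappa + 1 - 2\rho/\sqrt{\kappa})}$ as defined earlier in this section. The function names and the inputs in the example are illustrative design choices, not recommendations.

```python
import math

# Coefficients of X from Table 3.3 (log scale standard errors).
COEF_X = {"employed": 1.035e-02, "unemployed": 0.0706435}


def x_variable(n: int, kappa: float, rho: float) -> float:
    """Design variable X for n observations per wave, size ratio kappa, correlation rho."""
    return math.sqrt((1.0 / n) * (1.0 / kappa + 1.0 - 2.0 * rho / math.sqrt(kappa)))


def predicted_se(series: str, n: int, kappa: float, rho: float) -> float:
    """Predicted standard error of the wave level impact (no-intercept fit assumed)."""
    return COEF_X[series] * x_variable(n, kappa, rho)


# Hypothetical design: each wave observed 8 times, half-size treatment sample.
se_low_rho = predicted_se("employed", n=8, kappa=0.5, rho=0.0)
se_high_rho = predicted_se("employed", n=8, kappa=0.5, rho=0.8)
```

Comparing the two values shows how much of the precision gain comes from the correlation alone when the duration and treatment sample size are held fixed.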


However, the intra cluster correlation between control and treatment samples is
usually unknown¹², unlike the sampling error and the rotation panel design induced
AR sampling error dynamics, which can be estimated from the sample data (see
Pfeffermann et al., 1998).


Using LFS historical data, samples in each rotation group were split into pseudo
control and treatment groups, which were then used to estimate the intra cluster
correlation. Appendix 2 presents some details of this study. We concluded that the
intra cluster correlation between the control and treatment groups was reasonably
high for employed and low for unemployed. This result is in line with our expectation:
employed persons are likely to keep their status in the two groups, resulting in a
high intra cluster correlation, whereas unemployed persons change their status more
often, so their intra cluster correlation is likely to be low. Our study demonstrated
the existence of intra cluster correlation within the same sample design of the LFS.


The future ABS LFS will be based on an address register list based master sample
rather than the current, traditional area master sample. Therefore, the intra cluster
correlation between the control and treatment samples will not be known. However,
this study suggests that we should explore possibilities to increase the intra cluster
correlation between control and treatment groups for any future parallel run. One
such idea is to design the parallel run as a randomized block design, using the first
phase geographical areas as block variables. The details will be discussed in the next
section.


12 The intra cluster correlation between control and treatment samples can be established if a block
experiment design is implemented for both control and treatment samples. See the detailed discussion in
Section 4.


How the wave level statistical impacts {α} affect the overall statistical impact on the
composite estimate is set out in the next section and in Appendix 4, in relation to the
correlations among the estimated {α}.


4. MEASURING STATISTICAL IMPACTS WITH A SHORT TIME SERIES 
FROM A PARALLEL RUN 


4.1 Issues of Kalman filter initialisation and convergence 


When using an SSM to estimate state variables, the Kalman filter is a commonly used
tool. It is a recursive algorithm that produces optimal results in terms of mean square
error when applied to linear models. The Kalman filter produces filtered
estimates of state variables at time t by taking the mean and variance of that state
conditional on observations up to time t-1 as an input. Therefore, initialisation at the
starting point t=0 is a necessary computational step, and it may be
accomplished in a variety of ways. If no prior information is available, a diffuse
initialisation of the Kalman filter is commonly used for non-stationary state variables,
which implies that the initial values have zero mean and very large variance. As a
result, state estimation will show a transient effect in the initial phase of Kalman
filtering before it converges to its steady state.


With a diffuse prior, the Kalman filter estimates all the state variables and their
covariance matrices from the parallel run data, even though some of the related
information is readily available. While this approach can be useful for approximate
exploratory work, it is not recommended for general use because it is not efficient and
can lead to large errors when the time series is short. A more efficient approach is to
use available a-priori information in an exact initialisation of the Kalman filter.
The Kalman filter can then converge more quickly, produce the state variables of
interest with better quality, and suffer less from the side effects of insufficient
convergence.
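
The effect of the prior on convergence can be illustrated with a deliberately simplified toy model, which is not the SSM of Section 3: a constant scalar state observed with independent noise, y_t = α + e_t. In this sketch the observation variance and both prior variances are hypothetical values chosen for illustration.

```python
import math


def filtered_se_path(prior_var, obs_var, n_obs):
    """Standard error of the filtered estimate of a constant state after each observation.

    Scalar Kalman filter for y_t = alpha + e_t with Var(e_t) = obs_var and no
    state noise, so only the measurement update changes the state variance.
    """
    p = prior_var
    path = []
    for _ in range(n_obs):
        gain = p / (p + obs_var)   # Kalman gain
        p = (1.0 - gain) * p       # updated (filtered) state variance
        path.append(math.sqrt(p))
    return path


OBS_VAR = 4.0 ** 2  # hypothetical sampling error variance

# Near-diffuse prior (very large variance) versus a hypothetical informative prior.
diffuse = filtered_se_path(prior_var=1e7, obs_var=OBS_VAR, n_obs=12)
informed = filtered_se_path(prior_var=2.0 ** 2, obs_var=OBS_VAR, n_obs=12)

for t in (1, 3, 6, 12):
    print(f"t={t:2d}  diffuse SE={diffuse[t - 1]:.3f}  informative SE={informed[t - 1]:.3f}")
```

In this toy setting the informative prior gives a smaller standard error at every period, and the gap is largest in the first few periods, which is where a short parallel run operates.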


Faster convergence enables us to achieve accurate state estimates in a shorter parallel
run, which is desirable from a cost and operational feasibility point of view. Because
the SIM model is stationary, stable and observable, the
unconditional mean and covariance can be used as the initial condition of the state
variables (Aoki 1987). We set the initial conditions for


(1) the state {α} with values of zero and large variances¹³ with a known
correlation structure (see details in Appendix 4),


(2) the sampling error state {e} with zero expectations, estimated variance
and zero correlations.
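
For reference, the unconditional moments of a stationary state can be written as follows. The notation (T, Q, φ, σ_η) is generic textbook notation assumed here for illustration and is not taken from the model equations of this paper.

```latex
% Assumed generic stationary state equation:
%   x_{t+1} = T x_t + \eta_t, \qquad \operatorname{Var}(\eta_t) = Q,
% with all eigenvalues of T inside the unit circle.
\begin{align}
  \mathrm{E}(x_0) &= 0, \\
  P_0 &= T P_0 T^{\top} + Q
  \quad\Longleftrightarrow\quad
  \operatorname{vec}(P_0) = (I - T \otimes T)^{-1} \operatorname{vec}(Q).
\end{align}
% Scalar AR(1) special case, x_{t+1} = \phi x_t + \eta_t with |\phi| < 1:
\begin{equation}
  P_0 = \frac{\sigma_\eta^{2}}{1 - \phi^{2}} .
\end{equation}
```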


The covariance of the Kalman filter estimated {α} is the state covariance matrix,
which has a special form because of the special SSM structure. In Appendix 4, we
derive the analytical solution of the corresponding state correlation matrix, which is
the sum of the serial cross-correlation of sampling errors {e}.


Subsequently, the standard error of the statistical impact on the composite estimate
can be approximated by applying a multiplier to the standard error at wave level.
The multiplier values are 0.6798 for LFS unemployed and 0.8681 for LFS employed
(see the derivation details in Appendix 4).


We evaluated whether the Kalman filter worked effectively for the SSM by simulating
100 replicates based on the features of LFS unemployed, with κ = 1, ρ = 0 and
12 observations in each wave; i.e. the treatment group had the same sample size as the
control group, there was no intra cluster correlation between them, and the duration of
the parallel run was 19 months.


Figure 4.1 shows how the SSM performed. The left panel plots the average standard
error of the filtered estimates of impact {α} against the number of observed periods of
parallel data collection (= duration of parallel run minus 7) on the horizontal axis.
SE(Proj) is the projected standard error beyond 12 observation periods¹⁴. SE(RG) is
the reference for the theoretical standard error of each wave when it reaches its
steady state.


4.1 Standard error of filtered estimate of statistical impact at RG level for unemployment


13 A priori information should be used if available. As mentioned in Section 1, experimental design and field
tests can be utilised to obtain such prior information. How this can be done is outside the scope of this paper.
14 The projected standard error is produced by empirically modelling the simulated results up to 12
observation periods. It illustrates the trajectory of the Kalman filter convergence property.


From the left panel of Figure 4.1, it is clear that the standard errors of {α} are
consistent regardless of the true values of {α}. The projected standard error (SE(Proj))
demonstrates that the standard error of the filtered estimates will continue to
converge to the theoretical value as the number of observations increases over time.


The question is whether the filtered estimate is accurate enough to meet the
predefined accuracy criterion (see the details in Section 3). To answer this question,
we calculated the equivalent MDI ratios (MDIR) for the national level composite
estimate and plotted them in the right panel of Figure 4.1. The MDIR is 1.09, which falls
slightly short of the predefined accuracy criterion after 12 observation periods, i.e.
a 19-month parallel run.


It appears that under this approach 13 observations (i.e. a parallel run length of 20
months) are required to detect a one standard error impact for the national level
composite estimate. Given that the Kalman filter approach is subject to a transient
period, we need to look into how to improve its convergence rate. This can be
achieved by providing better prior information for Kalman filter initialisation, for
example through the Hybrid Option in Section 5, and by using other statistical methods
from randomized experiments as a part of SIM, that is, by finding more efficient designs
for the parallel run.


4.2 Sample design based experimental approach 


In Section 3, an SSM for the parallel run (eqn. 3.9 - 3.10) was proposed. The
improvement in precision due to the positive correlation between survey errors of the
regular survey and the treatment survey comes from the intra-cluster correlation,
since samples are drawn from the same primary sampling units. Taking advantage of
this intra-cluster correlation in our power calculations implies that the parallel run to
measure a level shift {α} must be designed as a Randomized Block Design (RBD)
with the cross-classification of Primary Sampling Units (PSU) and months as the block
variables in the experimental design. By doing so, the variation between blocks can be
eliminated from the variance of the statistical impact estimate. Van den Brakel and
Renssen (2005) have studied how to design embedded experiments to detect and
quantify possible level shifts in time series due to necessary changes to a sample
survey, providing a safe transition from an old to a new survey design. They developed
design-based procedures for the analysis of large scale experiments embedded in
sample surveys, for example to estimate the effect of a redesign on the outcomes of a
sample survey. Variance estimates derived in their paper directly utilise the gain in
precision based on the fact that observations within blocks (PSU times month) are
more homogeneous. Instead of estimating the variance of a contrast as the variance of
the sum of the sampling errors of the regular and treatment surveys minus (two times)
the covariance between both errors, they directly estimate the net variance by
removing the variation between blocks from the estimated treatment effect.
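
The contrast variance referred to above can be written out explicitly. The symbols below (ŷ_T and ŷ_C for the treatment and control estimates, σ_T and σ_C for their standard errors, ρ for the correlation between their sampling errors) are assumed notation for this sketch rather than the notation of Van den Brakel and Renssen (2005).

```latex
\begin{align}
  \operatorname{Var}(\hat{y}_T - \hat{y}_C)
    &= \operatorname{Var}(\hat{y}_T) + \operatorname{Var}(\hat{y}_C)
       - 2\operatorname{Cov}(\hat{y}_T, \hat{y}_C) \\
    &= \sigma_T^{2} + \sigma_C^{2} - 2\rho\,\sigma_T\sigma_C .
\end{align}
% Balanced case, \sigma_T = \sigma_C = \sigma:
\begin{equation}
  \operatorname{Var}(\hat{y}_T - \hat{y}_C) = 2\sigma^{2}(1 - \rho) .
\end{equation}
```

A positive ρ therefore reduces the variance of the estimated impact directly, which is the gain that blocking on PSU and month is intended to capture.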


This method has several advantages: 


1. The analysis is based on the micro-level, therefore has more degrees of 
freedom for variance estimation, and avoids the problem of short time series 
like those used in the SSM. 


2. The method is design-based which facilitates the interpretation of the results. 
This means that the method tests hypotheses about differences between 
survey estimates observed under two different approaches and does not 
assume some kind of model. An additional advantage of this is that the 
estimated statistical impacts are consistent with domains. 


The SIM phase 1 (described in Section 1) would also benefit from such an
experimental design, which meets its requirement of testing a number of changes
separately. Its results can then be utilised as inputs to the SSM
model for the parallel run.


Further study is needed to understand the merit of this approach, how it relates to the 
SSM approach, and to evaluate its feasibility for field operations and power. 


5. DIFFERENT OPTIONS FOR MEASURING STATISTICAL IMPACT AND 
CHANGE IMPLEMENTATION 


The following types of options have been considered by the ABS as a part of this 
study. 


Option A: 100% control sample to maintain the current LFS production quality during 
the parallel run, while finding optimal combinations of the treatment sample size and 
length of parallel run. This option has the lowest risk level but would be costly. 


Option B: Reduce the size of the control sample and make the treatment sample size
equal to the control sample size, e.g. a control and a treatment group each equal to 75% of
the regular sample size of the current LFS. The aim of this balanced design is to
estimate the statistical impact as precisely as possible, at the cost of accepting more
volatile regular LFS estimates for official publication purposes during the parallel run
period. This option might not be accepted by external users because of the increase in
sampling error in regular survey estimates.


Option C: Phase in the new process one rotation group per month.
After 8 months, the existing process is fully changed over to the new process. This
strategy does not have a period of parallel data collection, thus statistical impact
measurement relies fully on a time series model (see eqn. 2.1 - 2.5) to estimate the
statistical impact. A potentially large revision may result and would have to be accepted
after starting the changeover. This is considered to be the highest risk option.


The remainder of this section assesses the three options and proposes a hybrid option 
if an additional revision is acceptable 12 months after the introduction of a new LFS 
survey. 


5.1 Evaluation of the three options 


From the formulae developed in Section 3, we can calculate the parallel run
parameters based on a given set of scenarios. We assumed that the intra cluster correlation
between control and treatment samples is zero (i.e. ρ = 0) because a new LFS design
is hypothetical at this stage. Table 5.1 shows the length of the parallel run required for
scenarios of Option A with a 100% and a 50% treatment sample (A100 and A50,
respectively), and Option B with 75% for both control and treatment samples (B75),
to meet the predefined accuracy criterion (see Section 3). It appears that none of the
options can meet the defined accuracy criterion within the operational feasibility
constraint of a parallel run shorter than 24 months.


5.1 Sample size and length of parallel run required for Options A and B


                             A100         A50           B75
Control sample size (%)      100          100           75
Treatment sample size (%)    100          50            75
SE on published estimates    Current      Current       1.15 times larger
Duration                     32 months    >44 months    36 months
Risk                         Low          Low           Moderate


Options A50 and B75 have the same total sample per month and their costs should
be similar. However, B75 is a balanced design and is more efficient than A50 for
measuring statistical impact. Therefore, a shorter parallel run is sufficient, at the cost
of more volatile (1.15 times) published LFS estimates during the parallel run period
because of the reduced control sample size.
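
The 1.15 factor is what one would expect if the standard error of the published estimate scales with the inverse square root of the control sample size. That scaling is an assumption of this sketch (it ignores design effects and composite estimation); the function name is illustrative.

```python
import math


def se_inflation(sample_fraction):
    """SE multiplier when the sample is cut to `sample_fraction` of its current size.

    Assumption: SE is proportional to 1 / sqrt(sample size).
    """
    return 1.0 / math.sqrt(sample_fraction)


# Total monthly sample (% of the regular LFS sample) under each option.
total_sample = {"A50": 100 + 50, "B75": 75 + 75}

b75_factor = se_inflation(0.75)
print(f"Total sample: {total_sample}")
print(f"B75 control sample: published SE is about {b75_factor:.2f} times the current SE")
```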


For the unemployment example with a 24-month parallel run, the estimated relative
standard errors of a statistical impact for the composite estimates are 1.65%, 2.02%
and 1.91% for the three options respectively. They detect a one standard error (2.6%)
statistical impact with 5% type I error and powers of 47%, 36% and 39% respectively.
Equivalently, the three options can detect, with the predefined precision, statistical
impacts with MDI ratios of 1.24, 1.52 and 1.43 times the current survey standard error
(2.6%).
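
The reported powers and MDI ratios can be approximately reproduced with a normal approximation. The critical values below are assumptions chosen because they match the reported figures (a one-sided 5% value for the power, and 1.96 with 50% power for the MDI ratio); the formulae actually used are those of Section 3, which this sketch does not claim to replicate.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


CURRENT_SE = 2.6      # relative SE (%) of the current unemployed estimate
Z_ONE_SIDED = 1.645   # assumed critical value for the power calculation
Z_TWO_SIDED = 1.96    # assumed critical value for the MDI ratio (50% power)

# Relative SE (%) of the impact on the composite estimate, 24-month parallel run.
impact_se = {"A100": 1.65, "A50": 2.02, "B75": 1.91}

results = {}
for option, se in impact_se.items():
    power = normal_cdf(CURRENT_SE / se - Z_ONE_SIDED)  # power to detect a 2.6% impact
    mdi_ratio = Z_TWO_SIDED * se / CURRENT_SE          # smallest detectable impact / current SE
    results[option] = (power, mdi_ratio)
    print(f"{option}: power = {power:.0%}, MDI ratio = {mdi_ratio:.2f}")
```

Under these assumptions the powers round to the reported percentages, and the MDI ratios agree with the reported values to within rounding of the input standard errors.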


Phase-in (Option C) is an implementation strategy without parallel data collection and
is not designed for accurate statistical impact measurement. This approach carries a high
risk since there is limited opportunity to appraise the impacts before implementation.


A one standard error statistical impact is not detectable with the required accuracy
within the 8-month phase-in period because the statistical impact is wave sensitive
and the observations of new samples are incomplete or insufficient. Table 5.2
illustrates our simulation results with a one standard error statistical impact for LFS
unemployed (2.6%), where the impact is measured at 3, 5 and 8 months for 100
replicates under the unrealistic assumption that the impact is uniform across all waves
(wave insensitive impact).


5.2 Total level shift detected by SSM across 100 replicates (Unemployed)


Periods after the first new design        Overall impact %    MDI ratio
rotation group is introduced (month)

Simulated                                 2.60
3                                         1.86                4.1
5                                         2.23                4.0
8                                         2.34                3.9


The measured impacts at 3 and 5 months (1.856% and 2.232% respectively) are
clearly not accurate. The MDI ratio indicates that only an impact of more than 3.9
standard errors of the unemployed population estimate can be detected with the required
precision. Under this circumstance, there are two choices for handling the
situation if this option were applied alone:


1. Ignore the impact because the measured impact cannot meet the accuracy 
criterion. The statistical impact will be in the published estimates and could be 
misinterpreted as real world change. 


2. Apply an adjustment based on the estimated impact. This action is ad-hoc and 
potentially subject to large revision later. 


Neither of these choices is considered acceptable; this is therefore a very high risk
option.


Table 5.2 also shows that the estimated impact is close to the real impact after the end 
of phase-in (8 months). However, the impact cannot be measured accurately even 
after 24 months. The details are presented in the next sub-section. 


5.2 Simulation Studies for Different Options and a Hybrid Option 


The advantage of a large parallel run (Option A) is that it minimises the risk for regular 
publication purposes during the changeover. If unexpected results are observed with 
the new process during the parallel run there is still the flexibility to fall back on the 
old approach. Because a large parallel run can estimate the statistical impact directly
with high precision, another advantage is that it facilitates the implementation of the
new survey without further revision of the impact measurements after the
changeover. This approach, however, is not cost effective since a significant data
collection effort is required.


The opposite approach (Option C) is having no parallel run and estimating the impact 
using a time series model. The major advantage of this approach is that it is 
inexpensive and avoids the additional fieldwork required for a parallel run. Skipping a 
period of parallel data collection and relying on a time series model to estimate the 
statistical impact also has several disadvantages and risks. First, it is not clear in 
advance if the time series estimates for the statistical impact will reach the required 
precision. Furthermore, estimates for the impact are unreliable directly after the 
changeover, and are likely to be revised after new observations become available 
under the new survey design. As a consequence, revisions must be expected and 
accepted. Implementing the changeover without a period of parallel data collection 
increases risk during the changeover. If the new survey design turns out to be a failure 
or has a significant impact, and it is therefore decided to fall back on the old approach, 
then there is a period where no data or less reliable data are available for the 
production of official statistics. 


An intermediate option is to have a small parallel run and combine this information 
with a time series modelling approach. We start for example with a parallel run with 
20% or 50% of the regular sample size for a period of 12 months. The results obtained 
from the parallel run could be used as a-priori information in the time series model. 
The information that becomes available under the new approach would be used in the 
time series model to further improve the precision of the impact estimates. Note that 
this option directly reduces the risk of having a period without official figures after the 
changeover and also reduces the amount of revisions. 


Simulations of the national unemployed and employed persons from the rotation
group level estimates were conducted to illustrate the precision of the impact
estimates¹⁵. The simulation was run with the time series model approach without a
parallel run and with five different parallel run scenarios of reduced sample sizes as 
summarized in Table 5.3. The standard errors in Table 5.3 refer to the statistical 
impact estimates at the rotation group level obtained with the control sample, the 
treatment sample and the specified parallel run periods. The sample size percentages 
refer to the current sample size of the regular LFS. The Minimum Detectable Impact 
(MDI) ratios were calculated for the overall composite estimates. 


15 Most precision discussions in this section focus on the rotation group level because the standard
error of a statistical impact on the overall composite estimate can be approximated as the standard error at
rotation group level times a multiplier.


5.3 Different scenarios of parallel run used in the simulation¹⁶

            Sample size          Parallel     Unemployed                          Employed
            Control   Treatment  run period   Standard error         MDI ratio   Standard error         MDI ratio
Scenario    sample    sample     (months)     % points   total                   % points   total
1           100%      20%        18           7.9        61620       4.05        1.08       135000      5.25
2           100%      50%        12           5.6        43680       2.87        0.81       101250      3.94
2           100%      20%        24           5.6        43680       2.87        0.81       101250      3.94
3           100%      50%        18           3.9        30420       2.00        0.54       67500       2.63
4           100%      100%       12           3.7        29016       1.90        0.54       67500       2.63
5           100%      25%        24           4.7        36348       2.41        0.68       85000       3.31
6           100%      100%       18           2.5        19500       1.28
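
The unemployed MDI ratios in Table 5.3 appear to follow from the wave-level standard errors through the composite multiplier of Section 4.1. The sketch below assumes MDI ratio = 1.96 × (multiplier × wave-level SE) / 2.6, i.e. a two-sided 5% test at 50% power relative to the current standard error of 2.6%; this formula is inferred from the tabulated numbers, not quoted from the paper.

```python
MULTIPLIER_UNEMPLOYED = 0.6798  # wave-level to composite SE multiplier (Section 4.1)
CURRENT_SE = 2.6                # relative SE (%) of the current unemployed estimate
Z = 1.96                        # assumed: two-sided 5% test at 50% power

# Scenario: (wave-level SE in % points, MDI ratio), as tabulated for unemployed.
table_5_3 = {
    1: (7.9, 4.05),
    2: (5.6, 2.87),
    3: (3.9, 2.00),
    4: (3.7, 1.90),
    5: (4.7, 2.41),
    6: (2.5, 1.28),
}


def mdi_ratio(wave_se):
    """MDI ratio for the composite estimate implied by a wave-level SE."""
    composite_se = MULTIPLIER_UNEMPLOYED * wave_se
    return Z * composite_se / CURRENT_SE


computed = {scenario: mdi_ratio(se) for scenario, (se, _) in table_5_3.items()}

for scenario, (se, tabulated) in table_5_3.items():
    print(f"Scenario {scenario}: SE={se:.1f}  computed={computed[scenario]:.2f}  tabulated={tabulated:.2f}")
```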


5.4 Minimum detectable impact ratio at 5% significance level and 50% power obtained with the
time series model for different periods after the changeover for unemployed labour force (left
panel) and employed labour force (right panel)


The standard errors obtained with the time series model without a parallel run and
with the five different scenarios are aggregated for the different periods observed after the
changeover to the new design. The MDI ratio values obtained directly after the parallel
run are all greater than one except for unemployed scenario six. This suggests that none
of the parallel run results from the first five scenarios can meet the predefined
precision. Figure 5.4 depicts the MDI ratios of the different scenarios against different
periods after the changeover for the unemployed and the employed. For the
unemployed, a sixth scenario is added to illustrate what the time series model adds if
it is applied after the full parallel run of 100%-100% for a period of 18 months. The
MDI ratio for the scenario without a parallel run converges to a value of about 2 for
unemployed and about 3 for employed, which implies that under this scenario
an impact of one standard error cannot be detected. For example, for the
unemployed series of Scenario 1, a one standard error impact still cannot be detected
with the predefined accuracy criterion after 24 months. For Scenario 4, this precision
is obtained after 19 months, and for Scenario 6 it takes 11 months to achieve this
precision.


16 Note that for employed, Scenario 4 equals Scenario 3.


To illustrate the volatility of the impact estimates if there is no parallel run, i.e. Option
C, the left panel of Figure 5.5 shows for 10 replicates how the time series model
estimates the impact as more observations become available after the changeover in a
rotation group. The horizontal axis depicts the number of months observed under the
new design after the changeover. As can be seen, it takes about 12 months before a
stable estimate for the impact in a particular wave is obtained. The right panel
contains similar estimates, but now combined with the information obtained from a
parallel run under Scenario 3. The time series model further improves the impact
estimates, and the volatility of the estimates directly after the changeover is
clearly reduced.


5.5 Impact estimates for the unemployed for 10 replicates of one wave obtained with the time 
series model for different periods after the changeover without a parallel run (left panel) and with 
a parallel run according to Scenario 3 (right panel). The real value of the discontinuity is 15 
percentage points which is equal to 117000 unemployed 


Figure 5.6 gives similar figures for the employed labour force. Estimates appear to be
more stable and converge faster compared with the unemployed.


5.6 Impact estimates for the employed for 10 replicates of one wave obtained with the time 
series model for different periods after the changeover without a parallel run (left panel) and with 
a parallel run according to Scenario 3 (right panel). The real value of the discontinuity is 5 
percentage points which is equal to 625000 employed. 


The first part of Table 5.7 summarizes the standard errors of the impact estimates for
the unemployed (in percentage points) in the separate waves (after 24 months).
As expected, the precision increases with the sample size of the parallel run. The table
shows that without a parallel run, the impacts for the rotation groups are
estimated with a standard error of about 3.5%. Under Scenario 1, the parallel run
results in an impact estimate with a standard error of 7.9% (from Table 5.3). The time
series model improves the precision to about 2.6% (last column of Table 5.7). In a
similar way, a parallel run standard error of 5.6% is further
reduced by the time series model to 2.2% (Scenario 2), and one of 3.9% to 1.8%
(Scenario 3).


A consequence of improving the results of a relatively small parallel run with a time
series model is that the initial estimates of the statistical impact obtained with the
parallel run are likely to be revised after, for example, a period of 12 months. Using a
simulation study, we estimated the expected size of the revisions between the
estimates obtained from parallel runs under the five scenarios and the time series model
estimates after 12 months. As expected, the size of the revisions decreases with the sample
size of the parallel run. The expected revision (in the "Average" column of Table 5.7) is
about 5.8% under Scenario 1, 4% under Scenario 2 and 2.7% under Scenario 3.


The final standard errors and revisions of the SIM estimates are given for the 
employed for the different scenarios in Table 5.8 in terms of percent points. 


5.7 Standard errors for impact measurement estimates and revisions for the unemployed labour
force after 12 months under different parallel run options, in percentage points


                 RG1    RG2    RG3    RG4    RG5    RG6    RG7    RG8    Average
S.E. of impact
  No PR          3.18   3.35   3.45   3.52   3.58   3.63   3.66   3.69   3.5
  Scenario 1     2.36   2.47   2.51   2.54   2.57   2.63   2.67   2.74   2.6
  Scenario 2     2.05   2.13   2.15   2.17   2.19   2.23   2.29   2.37   2.2
  Scenario 3     1.74   1.79   1.79   1.79   1.81   1.84   1.91   2.00   1.8
  Scenario 4     1.70   1.75   1.74   1.75   1.77   1.80   1.86   1.95   1.8
  Scenario 5     1.88   1.95   1.96   1.97   1.99   2.03   2.09   2.18   2.00
  Scenario 6     1.42   1.44   1.43   1.43   1.44   1.47   1.52   1.61   1.47
Revision
  Scenario 1     5.62   6.13   5.88   6.53   4.71   5.71   5.54   5.93   5.8
  Scenario 2     3.97   4.22   3.93   4.57   3.29   3.89   3.82   4.02   4.0
  Scenario 3     2.73   2.83   2.59   3.11   2.23   2.56   2.58   2.65   2.7
  Scenario 4     2.60   2.69   2.45   2.95   2.12   2.43   2.45   2.51   2.52
  Scenario 5     3.29   3.46   3.17   3.77   2.71   3.15   3.13   3.25   3.24
  Scenario 6     1.69   1.72   1.55   1.88   1.38   1.54   1.58   1.56   1.61


Revisions are calculated as the mean of the absolute difference between
the initial estimate from the parallel run and the time series estimate 12 months
after the changeover. If the size of the revisions is compared with the standard error
of the SIM, it must be concluded that the revisions are still substantial, particularly in
the case of small parallel runs. As expected, the size of the revision decreases as the
size of the parallel run increases. As illustrated with Scenario 6 for the unemployed
labour force, the time series model still produces revisions after a full parallel run
designed to observe a SIM of one standard error at a 5% significance level and a power
level of 50%.
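
The revision measure described above can be sketched as follows. The replicate values are hypothetical and serve only to show the calculation; they are not simulation results from this study.

```python
from statistics import mean


def mean_absolute_revision(initial_estimates, final_estimates):
    """Mean over replicates of |final estimate - initial estimate|."""
    if len(initial_estimates) != len(final_estimates):
        raise ValueError("both inputs must contain one value per replicate")
    return mean(abs(f - i) for i, f in zip(initial_estimates, final_estimates))


# Hypothetical impact estimates (percentage points) for four replicates of one wave:
# initial = from the parallel run, final = time series estimate 12 months later.
initial = [12.0, 18.5, 15.5, 9.0]
final = [14.0, 16.5, 15.0, 12.5]

revision = mean_absolute_revision(initial, final)
print(f"Mean absolute revision: {revision:.2f} percentage points")
```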


5.8 Standard errors for impact measurement estimates and revisions for the employed labour force
after 12 months under different parallel run options, in percentage points


                 RG1    RG2    RG3    RG4    RG5    RG6    RG7    RG8    Average
S.E. of impact
  No PR          0.57   0.58   0.58   0.61   0.64   0.63   0.63   0.65   0.61
  Scenario 1     0.37   0.36   0.35   0.36   0.38   0.37   0.37   0.40   0.37
  Scenario 2     0.31   0.30   0.29   0.30   0.31   0.31   0.31   0.33   0.31
  Scenario 3     0.25   0.23   0.22   0.22   0.23   0.23   0.24   0.26   0.27
  Scenario 4     0.25   0.23   0.22   0.22   0.23   0.23   0.24   0.26   0.27
  Scenario 5     0.27   0.28   0.28   0.29   0.29   0.30   0.31   0.33   0.29
Revision
  Scenario 1     0.77   0.87   0.81   0.90   0.67   0.78   0.78   0.83   0.80
  Scenario 2     0.58   0.63   0.59   0.67   0.50   0.57   0.57   0.61   0.59
  Scenario 3     0.38   0.41   0.37   0.44   0.32   0.36   0.36   0.37   0.38
  Scenario 4     0.38   0.41   0.37   0.44   0.32   0.36   0.36   0.37   0.38
  Scenario 5     0.48   0.52   0.48   0.56   0.41   0.47   0.47   0.49   0.49


5.3 Revision analysis for the hybrid option 


The purpose of this analysis is to understand the properties of the hybrid option. This
option uses the initial estimates of statistical impacts from a small parallel run (with
standard error ini SE, shown in light grey), which may not be as accurate as desired, as
inputs to a time series model (SSM) to improve the accuracy 12 months after the
changeover. In particular, the relationships between the standard error of the initial
estimates (ini SE), the standard error of the final statistical impacts (SE 12 months after
changeover) and the revision size 12 months after changeover are explored.


ABS • USING STATE SPACE MODELS FOR STATISTICAL IMPACT MEASUREMENT OF SURVEY REDESIGNS - A Case Study of the ABS
Labour Force Survey


5.9 Comparison of initial SE from parallel run, final SE and revision after 12 months


The top panel of Figure 5.9 plots the three sets of simulated results for both
unemployed and employed under the different scenarios presented in the last
sub-section. The lower panel plots the revisions as circles and the fitted values as a
line obtained by regressing the revision sizes on the initial SEs.


It appears that the regression lines fit the simulated results very well for both
employed and unemployed. Table 5.10 shows the regression results and performance.
The coefficients of ini SE for the unemployed and the employed are both 0.78, which
suggests that the hybrid option removes about 80% of the errors regardless of the
quality of ini SE.


5.10 Revision size regressed on ini SE


                      Unemployed               Employed
Intercept             -0.367054 (0.021980)     -0.0395645 (0.0008405)
ini SE                0.779785 (0.004307)      0.7774410 (0.0010967)
Null deviance         10.9734000               1.2628e-01
Residual deviance     0.0016739                1.0051e-06
R²                    0.99984                  0.999992


Table 5.11 below shows the results from regressing the SE of the final estimates from
the hybrid option 12 months after changeover on the initial SE (ini SE) from the
parallel run. The coefficients of ini SE for the unemployed and employed series are
0.21 and 0.18 respectively, which suggests that the hybrid option still retains about 20%
of the errors in the final estimates 12 months after changeover, regardless of the quality
of ini SE.


5.11 SE of final estimates regressed on ini SE


                      Unemployed               Employed
Intercept             0.99578 (0.05123)        0.169087 (0.009134)
ini SE                0.20939 (0.01004)        0.180600 (0.011918)
Null deviance         0.8002000                0.0069333
Residual deviance     0.0090932                0.0001187
R²                    0.988636                 0.98288


Tables 5.10 and 5.11 show consistent results for both unemployed and
employed. Although the ini SE figures are not exactly equal to the revision plus the SE of
the final estimates, the coefficients of ini SE indicate that the
hybrid option reduces the errors by about 80% over the 12 months after the changeover,
regardless of the quality of the ini SEs. The remaining approximately 20% of the errors is
likely to persist in the final estimates.
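
As a worked illustration, the fitted lines for the unemployed series can be evaluated at the initial standard errors of Scenarios 1 to 3 (7.9, 5.6 and 3.9 percentage points, from Table 5.3) and compared with the "Average" column of Table 5.7. The helper below is only a sketch of that arithmetic; the dictionary and function names are illustrative.

```python
# Fitted coefficients for the unemployed series (Tables 5.10 and 5.11).
REVISION_FIT = {"intercept": -0.367054, "slope": 0.779785}
FINAL_SE_FIT = {"intercept": 0.99578, "slope": 0.20939}


def predict(fit, ini_se):
    """Evaluate a fitted straight line at a given initial standard error."""
    return fit["intercept"] + fit["slope"] * ini_se


# ini SE from the parallel run, in percentage points (Table 5.3).
ini_se_by_scenario = {1: 7.9, 2: 5.6, 3: 3.9}

predictions = {
    scenario: (predict(REVISION_FIT, ini_se), predict(FINAL_SE_FIT, ini_se))
    for scenario, ini_se in ini_se_by_scenario.items()
}

for scenario, (revision, final_se) in predictions.items():
    print(f"Scenario {scenario}: predicted revision = {revision:.2f}, predicted final SE = {final_se:.2f}")
```

The predicted revisions and final standard errors are close to the tabulated averages (5.8, 4.0, 2.7 and 2.6, 2.2, 1.8 respectively), and the two slopes sum to roughly one, which is consistent with the 80%/20% reading above.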


6. DISCUSSION AND FURTHER WORK 


This paper presents a set of SSM models and evaluations for a range of options for 
measuring statistical impacts from a survey redesign, using the ABS Labour Force Survey 
redesigns as a case study. The paper has shown that by modelling the differences of the 
GREG estimates for control and treatment groups at rotation group level, the model for 
measuring statistical impact from a parallel run simplifies the conventional SSM 
intervention analysis. This proposed model should be more robust and has the following 
advantages over the conventional SSM intervention approach: 


• Eliminating the possible complications due to modelling the “true” population 
during parallel run periods; 


• Avoiding the smoothing effect caused by the lagged composite weights; 


• Taking account of the dynamics of the process induced by sample rotation more 
effectively; 


• Initialising the Kalman filter with a-priori information rather than a diffuse prior, 
because the model is stationary and the expected variances (covariances) of states 
are used to speed up the Kalman filter convergence rate. 


Through theoretical deliberation and empirical simulation study, we now understand the 
relationship between the precision of detecting a statistical impact and: 


1. the parallel run parameters (i.e. the intra-cluster correlation between the 
treatment and control groups, the sample size and the duration of the parallel run); 


2. the effect of sampling error (i.e. the correlation structure of statistical impacts at 
rotation group level, and how it affects the statistical impact on the final composite 
estimate); and 


3. the improvement and revision properties of the hybrid option after changeover. 


The knowledge presented in this paper articulates how the survey parallel parameters, 
the characteristics of the LFS survey, such as intra-cluster correlation and sampling errors, 
and the properties of the SSM affect the precision of an estimated statistical impact. 
Therefore, it can assist in the decision making process for managing the risk, quality and 
cost of a statistical impact measurement strategy. 


In terms of the options considered for measuring impact and implementing change, it is 
clear that a scenario without a parallel run is relatively inexpensive but has major 
drawbacks in terms of the risk around the quality of published time series data 
(particularly coherence and interpretability) during the changeover period. In addition, 
the required accuracy criterion is unlikely to be achievable with this approach. For a 
critically important survey like the LFS, a large scale parallel run is required (assuming a 
low appetite for accepting statistical impacts on the series). 


There are two possibilities to reduce the costs of the parallel run. Either the precision 
goal that an impact of one standard error must be detectable is relaxed or revisions of the 
estimated impact must be acceptable. In the latter case the time series modelling 
approach can be combined with a smaller sample size for the parallel run as illustrated 
with the six different scenarios investigated in Section 5. 


For small parallel runs there is, of course, a large risk that the revision of the initial 
estimates for the SIM is substantial, because the small parallel run does not produce 
precise initial estimates. This implies that the decision for making the changeover is 
based on an imprecise initial estimate. In a worst-case scenario the initial SIM estimates 
suggest a small impact and 12 months after the changeover the final SIM estimates 
appear to be substantially larger. This risk of course declines with the size of the parallel 
run and can be visualized by looking at the ratio of the revision and the standard error of 
the final SIM estimates as illustrated in Figure 5.4. 


Our study of the hybrid option suggests that useful information obtained from the SIM in 
Phase 1 activities, such as small experiments, field tests and dress rehearsals, can be used 
as priors for the SSM of a parallel run. In other words, using the SSM approach, the SIM 
information obtained from the current phase can be used as the priors and input to the 
SSM of the next phase. The SIM precision can thus be continually improved over the 
three phases. 


Further work is required to build on the learnings from this study as described in this 
paper. Some areas for future consideration may include: 
• Use of other data sources: The SSM (eqns. 2.1 to 2.5) used for the hybrid 
option can be extended to include related data sources in a multivariate 
seemingly unrelated time series equation (SUTSE) model to improve SIM 
precision by better predicting the true population. For example, Zhang and 
Honchar (2016) used unemployment benefit claimant counts (CC) as such a 
related series for LFS unemployment, and the ANZ job advertisement (ANZadv) 
and Department of Employment Internet Job Vacancy index (DoEIVI). 

• Evolving level shifts: In this paper we present results for a fixed level shift type 
of statistical impact. In reality the statistical impact induced by a statistical 
program change could evolve over time. Therefore, the current fixed 
intervention analysis needs to be extended to deal with an evolving level shift 
type of statistical impact with, for example, a random walk model. We should 
at least use the evolving level shift model to test whether the intervention is fixed. 

• Kalman filter initialisation: The study presented in this paper demonstrated 
that the performance of the parallel run SSM can be further improved with 
a-priori information in an exact initialisation of the Kalman filter. Further study 
is therefore recommended to maximise the precision of a parallel run / intra 
cluster correlation by taking advantage of statistical methods from the theory of 
randomized experiments. Such methods would have broad application for SIM 
phases 1 and 2. With a good understanding of its properties, particularly in 
comparison to the SSM based approach, we can position this alternative 
approach properly for SIM. 

• Alternative SSM formulation: Further study is needed to explore alternative 
SSM formulations which may utilise the historical data better to improve 
the Kalman filter convergence rate, shorten parallel run periods and 
reduce the standard error of the estimated statistical impact. 
APPENDIXES 


Appendix 1. Sample size needed to test one mean and the Minimum 
Detectable Impact (MDI) ratio 


1.1 Calculate sample size needed to test one mean: one-sample, two-sided 
equality 


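The formulas below are useful for tests concerning whether a mean, $\mu$, is equal to a 
reference value, $\mu_0$. The Null and Alternative hypotheses are 


$$H_0: \mu = \mu_0$$ 
$$H_1: \mu \neq \mu_0$$ 

Formulas 

$$n = \left( \sigma \, \frac{z_{1-\alpha/2} + z_{1-\beta}}{\mu - \mu_0} \right)^2 \qquad (A1.1)$$ 

$$1 - \beta = \Phi\left(z - z_{1-\alpha/2}\right) + \Phi\left(-z - z_{1-\alpha/2}\right), \quad z = \frac{\mu - \mu_0}{\sigma / \sqrt{n}} \qquad (A1.2)$$ 

where 


$n$ is the sample size 
$\sigma$ is the standard deviation; $\sigma / \sqrt{n}$ is an estimator of the standard error of $\mu$ 
$\Phi$ is the standard Normal distribution function 
$\Phi^{-1}$ is the standard Normal quantile function 
$\alpha$ is the Type I error 
$\beta$ is the Type II error, meaning $1-\beta$ is the power 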
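
As an illustration only, formulas (A1.1) and (A1.2) can be evaluated with the Python 
standard library. The function names `sample_size_one_mean` and `power_one_mean` 
are hypothetical, and the sketch assumes $z_{1-\alpha/2} = \Phi^{-1}(1-\alpha/2)$ and 
$z_{1-\beta} = \Phi^{-1}(1-\beta)$. 

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal, mean 0 and sd 1


def sample_size_one_mean(mu, mu0, sigma, alpha=0.05, beta=0.20):
    """Smallest integer n satisfying (A1.1) for a two-sided one-sample test."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    n = (sigma * (z_alpha + z_beta) / (mu - mu0)) ** 2
    return ceil(n)


def power_one_mean(mu, mu0, sigma, n, alpha=0.05):
    """Power 1 - beta from (A1.2) for a given sample size n."""
    z = (mu - mu0) / (sigma / sqrt(n))
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(z - z_alpha) + STD_NORMAL.cdf(-z - z_alpha)


if __name__ == "__main__":
    n = sample_size_one_mean(mu=2.0, mu0=1.5, sigma=1.0)
    print(n, round(power_one_mean(2.0, 1.5, 1.0, n), 3))
```

Setting `beta=0.5` makes $z_{1-\beta} = 0$, so the required sample size depends only on 
the Type I error quantile, which is the case used in the MDI discussion that follows. 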


1.2 Type I and Type II Error rates and the Minimum Detectable Impact 
(MDI) ratio 


Consider the case where we have an estimate $\hat{\Delta}$ of the statistical impact $\Delta$ arising 
from a change in measurement procedure and the standard error on $\hat{\Delta}$, $SE(\hat{\Delta})$, is 
estimated to be $\widehat{SE}(\hat{\Delta})$. 


We assume that the test statistic we will use, $z = \hat{\Delta} / SE(\hat{\Delta})$, has a $N(0,1)$ distribution 
under $H_0: \Delta = 0$ and a $N\left(\mu / SE(\hat{\Delta}), 1\right)$ distribution under $H_1: \Delta = \mu \neq 0$, and that we 
will declare there to have been a statistical impact (reject $H_0$) if 
$|z| > z_{1-\alpha/2} = \Phi^{-1}(1-\alpha/2)$, where $\Phi$ is the standard Normal distribution function and 
$\alpha$ is the prescribed Type I error rate (probability of declaring an impact when there is 
none). 


If $\beta$ is the prescribed maximum Type II error rate when $\Delta = \mu$ (maximum probability 
of declaring no impact when the true impact is $\Delta = \mu$), then 

$$\beta = \mathrm{Max}\left(\mathrm{Prob}\left(|z| < z_{1-\alpha/2} \mid \Delta = \mu\right)\right) = \Phi\left(z_{1-\alpha/2} - \frac{\mu}{SE(\hat{\Delta})}\right) - \Phi\left(-z_{1-\alpha/2} - \frac{\mu}{SE(\hat{\Delta})}\right)$$ 

Given $\alpha$ and $\beta$ we may solve this equation for $\mu / SE(\hat{\Delta})$. In the case where $\alpha = 0.05$ 
and $\beta = 0.5$ we obtain the solution $\mu / SE(\hat{\Delta}) = 1.96$. Note that a corollary of the formula 
for $\beta$ is that sample sizes meeting the 5% Type I error will automatically satisfy a power 
requirement of 50%. 


In terms of a standard measure of accuracy relevant to the standard error $SE(L)$ of an 
original population level estimate $L$ (e.g. employed or unemployed of the LFS), we 
define the ratio of minimum detectable impact (MDI) as $\mu / SE(L)$ and see that to meet 
requirements we need: 

$$\frac{\mu}{SE(L)} \geq 1.96 \, \frac{SE(\hat{\Delta})}{SE(L)}$$ 

Hence if we choose $\mu = SE(L)$ and require a 5% Type I error and a Type II error of 50% 
(see the accuracy criterion in Section 3), then our measurement design will need to 
ensure that 

$$1.96 \, \frac{SE(\hat{\Delta})}{SE(L)} \leq 1 \quad \left(\text{or } \frac{SE(\hat{\Delta})}{SE(L)} \leq \frac{1}{1.96} \approx 0.51\right).$$ 
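
The following sketch solves the Type II error equation numerically and reproduces the 
1.96 and 0.51 values quoted above. It is an illustration under the stated Normal 
approximation; the function names are hypothetical and `ratio` stands for the impact 
divided by the standard error of its estimate. 

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type2_error(ratio, alpha=0.05):
    """Type II error rate when the true impact is `ratio` standard errors."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(z_alpha - ratio) - STD_NORMAL.cdf(-z_alpha - ratio)


def detectable_ratio(alpha=0.05, beta=0.5, tol=1e-10):
    """Solve type2_error(ratio) = beta for ratio >= 0 by bisection.

    The error rate falls monotonically from 1 - alpha at ratio 0 towards 0.
    """
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if type2_error(mid, alpha) > beta:
            lo = mid  # impact still too small to reach the required power
        else:
            hi = mid
    return (lo + hi) / 2


def max_se_ratio(mdi_ratio=1.0, alpha=0.05, beta=0.5):
    """Largest SE(impact estimate) / SE(L) that still detects mdi_ratio * SE(L)."""
    return mdi_ratio / detectable_ratio(alpha, beta)


if __name__ == "__main__":
    print(round(detectable_ratio(), 2), round(max_se_ratio(), 2))
```

Tightening the power requirement, for example `beta=0.2`, raises the required ratio to 
about 2.8 and therefore lowers the admissible standard error of the impact estimate. 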


APPENDIX 2. ESTIMATION OF INTRA CLUSTER CORRELATION OF 
EMPLOYED AND UNEMPLOYED 


2.1 Estimates generation from the historical LFS data 


The intra cluster correlation between treatment and control group sampling errors is a 
crucial parameter that influences the power of a parallel run to detect a statistical 
impact in a time series. However, this intra cluster correlation is not usually known 
a-priori. An exercise was conducted to estimate its approximate value from the 
historical LFS data by splitting the historical sample for each month into two 
subsamples and then estimating the intra cluster correlation between the sampling 
errors of the two subsamples. 


First, each cluster of dwellings in the last month of the estimation period (in this study, 
November 2016) was systematically split into two half-clusters (A and B), i.e. using the 
current LFS selection method but with twice the skip that was actually used in the 
LFS. Then we matched backwards: the most recent month (November 2016) 
had the entire responding sample allocated to subsamples A and B as above, and this 
month was matched to the previous month (October 2016). We then continued to 
work backwards, each time doing a pairwise match of a given month with the 
previous month. This split used Primary Sample Unit (PSU) and time as the block 
variables in terms of experimental design. 


The GREG estimates for employed and unemployed persons were estimated from the 
split sample sets for all eight waves at the rotation group level by calibrating the split 
sample GREG estimates to the population characteristics. The period of time used for 
correlation calculation in this study was from January 2006 till November 2016. 


2.2 A simple intra cluster correlation estimator 


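Let $(y_{1,w,t}, y_{2,w,t})'$ be the two estimates from a split sample (control and treatment) of 
a rotation group for wave $w$. $y_t$ is the current published LFS composite estimate for the 
population of total employed and unemployed persons in original terms. Since there is 
no significant rotation group (RG) effect in $y_t$, we can assume 

$$\begin{pmatrix} e_{1,w,t} \\ e_{2,w,t} \end{pmatrix} = \begin{pmatrix} y_{1,w,t} - y_t - b_w \\ y_{2,w,t} - y_t - b_w \end{pmatrix}$$ 

contains only the sampling and rotation group induced error process for the two split 
samples for wave $w$, where $b_w$ is the wave bias. $y_{1,w,t}$ and $y_{2,w,t}$ are multiplied by 8 to 
align with published LFS composite estimates. 


$$\begin{pmatrix} e_{1,w,t} \\ e_{2,w,t} \end{pmatrix} \sim iid(0, \Sigma_w), \quad \text{where } \Sigma_w = \begin{pmatrix} \sigma^2_{1,w} & \rho_w \sigma_{1,w} \sigma_{2,w} \\ \rho_w \sigma_{1,w} \sigma_{2,w} & \sigma^2_{2,w} \end{pmatrix}$$ 


It can be shown that $\hat{\rho}_w \approx \rho_w$, $\forall w$: 

$$\hat{\rho}_w = \frac{\sum_{t=1}^{T} e_{1,w,t} \, e_{2,w,t}}{\sqrt{\sum_{t=1}^{T} e^2_{1,w,t} \; \sum_{t=1}^{T} e^2_{2,w,t}}}$$ 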
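
A minimal sketch of a split-sample correlation estimate is given below. It assumes the 
estimator is the uncentred sample correlation of the two residual series, and that each 
residual is the split estimate (scaled by 8, as described in the text) minus the composite 
estimate and the wave bias. The function names and the simulated inputs are 
hypothetical and are not the LFS data. 

```python
import random
from math import sqrt


def residuals(y_split, y_composite, wave_bias=0.0, scale=8):
    """Residual series: scaled split estimate minus composite estimate and bias."""
    return [scale * y - yc - wave_bias for y, yc in zip(y_split, y_composite)]


def split_sample_correlation(e1, e2):
    """Uncentred correlation between the residuals of the two half-samples."""
    cross = sum(a * b for a, b in zip(e1, e2))
    return cross / sqrt(sum(a * a for a in e1) * sum(b * b for b in e2))


def simulate_pair(rho, n, seed=0):
    """Two zero-mean Normal series with correlation rho (illustrative data)."""
    rng = random.Random(seed)
    e1, e2 = [], []
    for _ in range(n):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        e1.append(u)
        e2.append(rho * u + sqrt(1 - rho ** 2) * v)
    return e1, e2


if __name__ == "__main__":
    # 131 monthly observations, matching January 2006 to November 2016
    e1, e2 = simulate_pair(rho=0.5, n=131, seed=42)
    print(round(split_sample_correlation(e1, e2), 3))
```

With only 131 months per wave the estimate is itself noisy, which is consistent with the 
wave-to-wave variation in the results reported in Section 2.4. 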


2.3 A SSM intra cluster correlation estimator 


The purpose of this note is to derive a method / model to produce an estimate of 
cross-correlation between control and treatment groups sampling errors using the 
historical LFS data. 
Assume $\mathbf{y}_t = (y_{1,1,t}, y_{2,1,t}, \ldots, y_{1,8,t}, y_{2,8,t})'$ is a vector of GREG estimates for “control” 
and “treatment” samples (related to subsamples A and B described above). Then the 
model for $\mathbf{y}_t$ can be written as 

$$\begin{pmatrix} y_{1,1,t} \\ y_{2,1,t} \\ \vdots \\ y_{1,8,t} \\ y_{2,8,t} \end{pmatrix} = \mathbf{1} \, y_t + \begin{pmatrix} b_1 \\ b_1 \\ \vdots \\ b_8 \\ b_8 \end{pmatrix} + \begin{pmatrix} e_{1,1,t} \\ e_{2,1,t} \\ \vdots \\ e_{1,8,t} \\ e_{2,8,t} \end{pmatrix} \qquad (A2.1)$$ 

where the signal is $y_t$, the rotation group biases are $b_1, \ldots, b_8$ and the sampling error 
components can be expressed by equations (2.1-2.5) from Section 2, with the only 
difference being that the sampling error components here have double dimensions, 
reflecting the fact that the two sub-samples for each wave have different sampling error 
components (although the coefficients of the AR model and the sampling error variance 
are the same for the two splits). 


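For any wave $i$, the sampling error covariance matrix is 

$$\Sigma_{e_i} = \begin{pmatrix} \sigma^2_{e_i} & \rho \, \sigma^2_{e_i} \\ \rho \, \sigma^2_{e_i} & \sigma^2_{e_i} \end{pmatrix} \qquad (A2.2)$$ 

where $\rho$ is the intra cluster correlation between treatment and control group sampling 
errors. 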


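The relationship between the sampling error variance and the sampling error 
disturbance variance can be described by the equation 

$$\sigma^2_{\delta_i} = \gamma_i \, \sigma^2_{e_i} \qquad (A2.3)$$ 

where the covariance matrix of the corresponding sampling error disturbances is 

$$\Sigma_{\delta_i} = \begin{pmatrix} \sigma^2_{\delta_i} & \rho \, \sigma^2_{\delta_i} \\ \rho \, \sigma^2_{\delta_i} & \sigma^2_{\delta_i} \end{pmatrix} \qquad (A2.4)$$ 

and the loading factor is defined as 

$$\gamma_i = \begin{cases} 1, & i = 1 \ (\phi_1 = \phi_2 = 0) \\ 1 - \phi_1^2, & i = 2 \ (\phi_1 \neq 0, \ \phi_2 = 0) \\ (1+\phi_2)\left[(1-\phi_2)^2 - \phi_1^2\right] / (1-\phi_2), & i \geq 3 \ (\phi_1 \neq 0, \ \phi_2 \neq 0) \end{cases} \qquad (A2.5)$$ 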
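
The three cases of the loading factor (A2.5) are special cases of one expression, as the 
sketch below illustrates. It assumes the loading factor is the ratio of the disturbance 
variance to the sampling error variance of a stationary AR(2) process; the function 
names are hypothetical, and the example coefficients are the unemployed AR(2) values 
listed in Table A3.1. 

```python
def loading_factor(phi1=0.0, phi2=0.0):
    """Disturbance variance divided by sampling error variance, stationary AR(2).

    phi1 = phi2 = 0 gives 1 (wave 1), phi2 = 0 gives 1 - phi1**2 (wave 2),
    and the general expression applies to waves 3 and beyond.
    """
    return (1 + phi2) * ((1 - phi2) ** 2 - phi1 ** 2) / (1 - phi2)


def disturbance_variance(error_variance, phi1=0.0, phi2=0.0):
    """Disturbance variance that keeps the sampling error variance at the target."""
    return loading_factor(phi1, phi2) * error_variance


if __name__ == "__main__":
    for label, coeffs in [("wave 1", (0.0, 0.0)),
                          ("wave 2", (0.589, 0.0)),
                          ("waves 3-8", (0.466, 0.208))]:
        print(label, round(loading_factor(*coeffs), 4))
```

Because the factor is below one whenever the AR coefficients are non-zero, the 
disturbances need a smaller variance than the sampling errors they generate. 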


If the purpose is to use the rotation group level GREG estimates to estimate the 
correlation between the sampling errors of control and treatment samples, the 
simplest approach is to use the rotation group level GREG estimates from wave one, 
because the additional GREG estimates from the other waves do not add extra value 
but increase the complexity of the model specification. 


Therefore the simplified model for estimating the correlation between the sampling 
errors of control and treatment samples includes only two variables in the 
measurement equation 

$$\begin{pmatrix} y_{1,1,t} \\ y_{2,1,t} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} y_t + \begin{pmatrix} e_{1,1,t} \\ e_{2,1,t} \end{pmatrix} \qquad (A2.6)$$ 

Note that the RGB component disappears from equation (A2.6) because only wave 
one is involved and therefore the RGB will be absorbed by the signal $y_t$. 


The cross correlation of the sampling errors of the control and treatment samples can 
be estimated from the above model with the sampling error disturbance covariance 
matrix (A2.4) (loading factor $\gamma = 1$). 


2.4 Results of correlation calculation 


The following shows the results of the cross correlation estimates with and without the 
wave bias: 


Wave bias parameter values: 


Wave          1         2         3         4         5         6         7   8 
Employed      4.657e-3  0.476e-3  0.763e-3  0.557e-3  1.078e-3  0.954e-3  0   0.240e-3 
Unemployed    8.508e-2  4.917e-2  3.833e-2  2.722e-2  0.774e-2  1.539e-2  0   -1.113e-2 


Intra cluster correlation with wave bias correction: 


Wave          1       2       3       4       5       6       7       8       Overall 
Employed      0.5759  0.3651  0.5447  0.4354  0.5513  0.5293  0.5412  0.5253  0.5085 
Unemployed    0.1293  0.1648  0.2017  0.0965  0.0940  0.0936  0.1846  0.1856  0.1437 


Intra cluster correlation without bias correction: 


Wave          1       2       3       4       5       6       7       8       Overall 
Employed      0.5759  0.3651  0.5447  0.4354  0.5513  0.5293  0.5412  0.5253  0.5085 
Unemployed    0.1293  0.1646  0.2016  0.0964  0.0940  0.0936  0.1846  0.1856  0.1437 


The inclusion of the wave bias made virtually no difference to the aggregate estimate of 
the cross correlation. 


The correlation estimated from the SSM is: 
1) for employment: 55.6%, 
2) for unemployment: ~ 0%. 


APPENDIX 3: SIMULATED DATA GENERATION 


Data generation description 


The observations from the simulated data set followed the following structure, 

$$y^g_{i,t} = y_t + b^g_i + e^g_{i,t}$$ 

and the sampling error for wave $i \geq 3$ and $t \geq 3$ 

$$e^g_{i,t} = \phi_1 e^g_{i-1,t-1} + \phi_2 e^g_{i-2,t-2} + \delta^g_{i,t}$$ 

$$\delta^g_{i,t} \sim NID(0, \sigma^2_{\delta})$$ 


Note: The above equations are referenced from equations 2.1 and 2.3 to 2.5 


where, 
$i \in (1, 2, \ldots, 8)$ is the wave index 


$g \in (1, 2)$ is the group index where 1 = control group and 2 = treatment group 


$t$ is the time period 


$y_t$ is the “true” population estimate used in the simulation for employment and 
unemployment at time $t$ 


$b^g_i$ is the rotation group bias (RGB) for the $i$-th wave of the control and treatment group; 
note that the rotation group bias is time-invariant 


$e^g_{i,t}$ is the sampling error for the $i$-th wave of the control and treatment group at time $t$. It 
follows an autoregressive process of order 2, AR(2), with the disturbance term $\delta^g_{i,t}$, which is 
normally and independently distributed. 


“True” population estimate $y_t$ 


It was estimated by using state space models on the LFS national level estimate. The final 
estimate was obtained by excluding the standard error component in the state space 
model. 


The rotation group bias $b^g_i$ 


Each wave had a predefined rotation group bias value subject to a specific employment 
and unemployment simulation scenario. 


AR2 sampling error $e^g_{i,t}$ 

The structure of the errors in the wave level time series is serially dependent across 
waves. For the purposes of this simulation study, $e^g_{i,t}$ and $\delta^g_{i,t}$ were required to satisfy a 
predefined variance covariance structure subject to a specific employment and 
unemployment simulation scenario. 
$e^g_{i,t}$ could then be generated by calculating $\phi_1 e^g_{i-1,t-1} + \phi_2 e^g_{i-2,t-2} + \delta^g_{i,t}$ in the following 
recursive process to reflect the 8 months rotation cycle. 


For $t = 1, 2, 3, \ldots$ 

Sampling error for wave $i = mod(t,8) = 1$, control and treatment group, 

$$e^g_{i,t} = \delta^g_{i,t} \qquad \text{No AR process}$$ 

Sampling error for wave $i = mod(t,8) = 2$, control and treatment group, 

$$e^g_{i,t} = \phi_1 e^g_{i-1,t-1} + \delta^g_{i,t} \qquad \text{AR1 process}$$ 

Sampling error for wave $i = mod(t,8) = 3, 4, 5, 6, 7, 0$ and $t \geq 3$, control and treatment group, 

$$e^g_{i,t} = \phi_1 e^g_{i-1,t-1} + \phi_2 e^g_{i-2,t-2} + \delta^g_{i,t} \qquad \text{AR2 process}$$ 
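
The recursion can be sketched for a single rotation group followed through its eight 
waves, as below. This is an illustration rather than the simulation program used in the 
study: the function name is hypothetical, the disturbance standard deviations are 
supplied by the caller, and the way the control and treatment disturbances are 
correlated (a shared Normal component with correlation `rho`) is an assumption. 

```python
import random
from math import sqrt


def rotation_group_errors(phi_wave2, phi1, phi2, sd_dist, rho, rng):
    """Sampling errors of one rotation group over waves 1 to 8.

    Returns (control, treatment), each a list of eight sampling errors.
    sd_dist holds the disturbance standard deviation for each wave.
    """
    control, treatment = [], []
    for wave in range(1, 9):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        # Assumption: disturbances of the two groups share correlation rho
        d_control = sd_dist[wave - 1] * u
        d_treatment = sd_dist[wave - 1] * (rho * u + sqrt(1 - rho ** 2) * v)
        if wave == 1:    # no AR process in the first wave
            ar_c = ar_t = 0.0
        elif wave == 2:  # AR1 process on the wave 1 error
            ar_c = phi_wave2 * control[-1]
            ar_t = phi_wave2 * treatment[-1]
        else:            # AR2 process on the two previous waves
            ar_c = phi1 * control[-1] + phi2 * control[-2]
            ar_t = phi1 * treatment[-1] + phi2 * treatment[-2]
        control.append(ar_c + d_control)
        treatment.append(ar_t + d_treatment)
    return control, treatment


if __name__ == "__main__":
    rng = random.Random(2016)
    c, t = rotation_group_errors(0.589, 0.466, 0.208, [1.0] * 8, 0.5, rng)
    print([round(x, 2) for x in c])
    print([round(x, 2) for x in t])
```

Passing the random generator in explicitly makes it straightforward to use a different 
seed for each replicate. 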


Table A3.1 presents some key parameters for both control and treatment samples. 


A3.1 Parameters for simulation data generation 


                                      Employed      Unemployed 
Sample size per month                 30000 
Sampling error AR1 for wave 2         0.835         0.589 
Sampling error AR2 for wave 3 to 8    0.585, 0.3    0.466, 0.208 
RGB Control 
RSE at RGB                            0.94%         6.60% 
b_1                                   0.007         0.058267930 
b_2                                   0.001         0.019303798 
b_3                                   -0.0044       0.006714512 
b_4                                   -0.0044       0.000405143 
b_5                                   0.0005        0.017966054 
b_6                                   0.0001        0.019514134 
b_7                                   0             0 
b_8                                   0.0002        0.046400917 
RGB Treatment 
b^2_1                                 0.6           0.7 
b^2_2                                 0.001         0.2 
b^2_3                                 -0.001        0.2 
b^2_4                                 -0.001        -0.02 
b^2_5                                 0.001         -0.02 
b^2_6                                 0             -0.02 
b^2_7                                 0             0 
b^2_8                                 -0.6          -0.68 


The following pseudo code illustrates the data simulation process 


Set RSE RG control sample 


Iterate replicates 1 to 100 
    Iterate parallel run duration: 11, 13, 15, 19 
        Iterate Kappa: 0.3, 0.5, 0.8, 1 
            Derive treatment sample RSE from Kappa 
            Iterate intra cluster correlation: 0, 0.3, 0.5, 0.8 
                Generate both control and treatment sample 
            End 
        End 
    End 
End 


Note: 


The simulation program required the input parameters: $\phi_1$ for wave 2, $\phi_1$ and $\phi_2$ for 
wave 3 to wave 8, the standard errors of the control and treatment groups (reflecting 
$\kappa$), $\rho$, and the RGB control and treatment parameters. 


In the data simulation, the AR1 parameter $\phi_1$ for wave 2 was different to the AR1 
parameter $\phi_1$ for wave 3 and beyond. 


A different seed was used to generate the white noise component in each replication 
and therefore, the simulated observations differed only in the sampling error 
component. 
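
The nested iteration in the pseudo code can be expressed as a scenario grid, as in the 
sketch below. The names are hypothetical, and the rule for deriving the treatment RSE 
from Kappa is not given here, so it is passed in as a function; the inverse square root 
rule in the usage example is an assumption for illustration only. 

```python
from itertools import product

REPLICATES = range(1, 101)
DURATIONS = (11, 13, 15, 19)        # parallel run durations
KAPPAS = (0.3, 0.5, 0.8, 1.0)
CORRELATIONS = (0.0, 0.3, 0.5, 0.8)  # intra cluster correlations


def scenarios(control_rse, derive_treatment_rse):
    """Yield one settings dictionary per simulated data set."""
    for rep, duration, kappa, rho in product(
            REPLICATES, DURATIONS, KAPPAS, CORRELATIONS):
        yield {
            "replicate": rep,
            "seed": rep,  # a different seed for each replication
            "duration": duration,
            "kappa": kappa,
            "rho": rho,
            "control_rse": control_rse,
            "treatment_rse": derive_treatment_rse(control_rse, kappa),
        }


if __name__ == "__main__":
    # Assumed rule: treatment RSE scales with the inverse square root of Kappa
    grid = list(scenarios(0.066, lambda rse, kappa: rse / kappa ** 0.5))
    print(len(grid), grid[0])
```

Each dictionary would then be handed to the routine that generates the control and 
treatment samples for that scenario. 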


APPENDIX 4: THE STATE CORRELATION MATRIX AND THE STATISTICAL 
IMPACT TO THE OVERALL COMPOSITE ESTIMATES 


The SSM model equations (3.9) to (3.10) can be rewritten in a condensed form as 


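Observation equation: 

$$d_t = F X_t \qquad (A4.1)$$ 

State equation: 

$$X_t = G X_{t-1} + \omega_t \qquad (A4.2)$$ 

where $d_t$ is the vector of differences between the treatment and control group GREG 
estimates at rotation group level, 
$X_t = \begin{pmatrix} \alpha_t \\ e_t \\ e_{t-1} \end{pmatrix}$, $F = (I \;\; I \;\; 0)$, $\omega_t = \begin{pmatrix} 0 \\ \delta_t \\ 0 \end{pmatrix}$ and $\omega_t \sim N(0, Q)$. 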

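The AR1 model (A4.2) of the state vector $X_t$ can be written in MA form as 

$$X_t = G X_{t-1} + \omega_t = \sum_{i=0}^{\infty} G^i \omega_{t-i}$$ 

and the covariance matrix can be derived as 

$$\Gamma(0) = E(X_t X_t') = \sum_{i=0}^{\infty} G^i E\left[\omega_{t-i} \, \omega_{t-i}'\right] G^{i\prime} = \sum_{i=0}^{\infty} G^i Q \, G^{i\prime} \qquad (A4.3)$$ 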
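
Equation (A4.3) can be evaluated numerically by truncating the sum once the terms 
become negligible. The sketch below is illustrative: the function names are 
hypothetical, and the small transition matrix in the usage example is the companion 
form of a single AR(2) process with the unemployed coefficients, not the transition 
matrix of the SSM in this paper. 

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def stationary_covariance(G, Q, tol=1e-14, max_terms=10000):
    """Truncated sum of G^i Q G^i' over i = 0, 1, 2, ... as in (A4.3).

    Converges only when all eigenvalues of G lie inside the unit circle.
    """
    G_t = transpose(G)
    gamma = [[0.0] * len(Q) for _ in Q]
    term = [row[:] for row in Q]  # i = 0 term
    for _ in range(max_terms):
        gamma = [[g + x for g, x in zip(g_row, t_row)]
                 for g_row, t_row in zip(gamma, term)]
        term = mat_mul(mat_mul(G, term), G_t)  # next term: G * term * G'
        if max(abs(x) for row in term for x in row) < tol:
            break
    return gamma


if __name__ == "__main__":
    G = [[0.466, 0.208], [1.0, 0.0]]  # companion matrix of an AR(2) process
    Q = [[1.0, 0.0], [0.0, 0.0]]      # unit disturbance variance
    gamma = stationary_covariance(G, Q)
    print(round(gamma[0][0], 4), round(gamma[0][1] / gamma[0][0], 4))
```

In this example the ratio of the off-diagonal to the diagonal element is the lag 1 
autocorrelation of the AR(2) process, about 0.588. 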
Substituting the sampling error AR2 coefficients, and the diagonal covariance of the 
sampling error disturbances for $Q$, we have the analytical solution for the correlation 
matrix $\rho_\alpha$ corresponding to state $\alpha_t$. Tables A4.1 and A4.2 show the correlation matrix 
of state $\alpha_t$. 

A4.1 State $\alpha_t$ correlation matrix $\rho_\alpha$ of LFS unemployed 

Wave   1      2      3      4      5      6      7      8 
1      1.000  0.589  0.483  0.348  0.262 
2      0.589  1.000  0.589  0.483  0.348  0.262 
3      0.483  0.589  1.000  0.589  0.483  0.348  0.262 
4      0.348  0.483  0.589  1.000  0.589  0.483  0.348  0.262 
5                           0.589  1.000  0.589  0.483  0.348 
6                           0.483  0.589  1.000  0.589  0.483 
7                           0.348  0.483  0.589  1.000  0.589 
8                           0.262  0.348  0.483  0.589  1.000 


Legend (cell colours): lag 0, lag 1, lag 2, lag 3, lag 4 


A4.2 State $\alpha_t$ correlation matrix $\rho_\alpha$ of LFS employed 


Wave   1      2      3      4      5      6      7      8 
1      1.000  0.835  0.788  0.712  0.653 
2      0.835  1.000  0.835  0.788  0.712 
3      0.788  0.835  1.000  0.835  0.788 
4      0.712  0.788  0.835  1.000  0.835  0.788  0.712  0.653 
5                           0.835  1.000  0.835  0.788  0.712 
6                           0.788  0.835  1.000  0.835  0.788 
7                           0.712  0.788  0.835  1.000  0.835 
8                           0.653  0.712  0.788  0.835  1.000 


Legend (cell colours): lag 0, lag 1, lag 2, lag 3, lag 4 


For the LFS unemployed, it can be seen that the state correlation matrix is induced by 
the serial cross-correlation of the wave sampling errors (AR(1) parameter AR11 = 
0.589 for wave 2; AR(2) parameters AR12 = 0.466 and AR22 = 0.208 for waves 3-8). The 
coloured legends represent the lagged cross-correlation of the sampling error. 


In other words, the state correlation matrix is the same as the sum of serial cross- 
correlations of different waves (or the sum of the cross-correlation matrices of all 
orders/lags) of the sampling errors induced by the AR(2) process. Similarly, we can also 
show that the elements of the sampling error state are concurrently independent, i.e. 
its correlation matrix is an identity matrix, while they are temporally correlated. 


These special analytical results are crucially important in two respects: 
(1) they can be used as a part of the state prior information for Kalman filter 
initialisation; and 
(2) they also demonstrate that the estimated wave level statistical impacts $\{\alpha_t\}$ are 
concurrently correlated while the sampling error states $\{e_t\}$ are concurrently 
independent. 


The variance of the overall impact to LFS composite estimates can be estimated by 
$\frac{1}{8^2} W' P_\alpha W$, where $P_\alpha$ is the covariance matrix of state $\alpha_t$ and $W$ is the concurrent 
composite weight vector. Let $\sigma_\alpha$ denote the standard error vector of $\alpha_t$, then 
$P_\alpha = \sigma_\alpha \rho_\alpha \sigma_\alpha$. When the elements of $W$ are close to 1, and all the elements of $\sigma_\alpha$ are 
the same or very similar, then 

$$\frac{1}{8^2} W' P_\alpha W = \frac{1}{8^2} \, \sigma_\alpha \, (1_{8 \times 1})' \rho_\alpha 1_{8 \times 1} \, \sigma_\alpha .$$ 

Therefore, the standard error of the overall impact to the composite estimate can be 
derived from the wave level standard error with a multiplier 
$\frac{1}{8} \sqrt{(1_{8 \times 1})' \rho_\alpha 1_{8 \times 1}}$. 0.6798 and 0.8681 are the multiplier values for LFS unemployed 
and employed respectively. If the elements of the state $\alpha_t$ are inappropriately assumed 
independent, i.e. $\rho_\alpha = I$, the multiplier value is 0.3536. 


In that case the standard errors of the overall impact to the composite estimates of LFS 
unemployed and employed are likely to be underestimated by factors of 0.520 and 
0.407 respectively. 
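
The multipliers can be checked with a short calculation. The sketch below assumes 
that the state correlation matrix is a full 8 x 8 Toeplitz matrix whose autocorrelations 
start at the wave 2 AR(1) parameter and continue through the AR(2) recursion for all 
lags up to 7; this is an assumption, and the function names are hypothetical. Under it 
the results come out close to the quoted 0.6798, 0.8681 and 0.3536. 

```python
from math import sqrt


def ar_autocorrelations(rho1, phi1, phi2, max_lag=7):
    """Autocorrelations for lags 0..max_lag from the AR(2) recursion."""
    rho = [1.0, rho1]
    for _ in range(2, max_lag + 1):
        rho.append(phi1 * rho[-1] + phi2 * rho[-2])
    return rho


def toeplitz(rho):
    """Correlation matrix whose (i, j) entry depends only on the lag |i - j|."""
    n = len(rho)
    return [[rho[abs(i - j)] for j in range(n)] for i in range(n)]


def composite_multiplier(corr):
    """(1/n) * sqrt(1' corr 1): wave level SE to overall impact SE."""
    total = sum(sum(row) for row in corr)
    return sqrt(total) / len(corr)


if __name__ == "__main__":
    unemployed = composite_multiplier(toeplitz(ar_autocorrelations(0.589, 0.466, 0.208)))
    employed = composite_multiplier(toeplitz(ar_autocorrelations(0.835, 0.585, 0.3)))
    independent = composite_multiplier(toeplitz([1.0] + [0.0] * 7))
    print(round(unemployed, 4), round(employed, 4), round(independent, 4))
    print(round(independent / unemployed, 3), round(independent / employed, 3))
```

The two ratios printed last correspond to the underestimation factors of about 0.520 and 
0.407 that arise when the wave level impacts are wrongly treated as independent. 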